How Postgres Triggers Can Simplify Your Back End Development

How Postgres Triggers Can Simplify Your Back End Development(themythicalengineer.com)

93 points by sks147 3 years ago | 110 comments

My only hesitation with methods like this is it ends up splitting the business rules into two places, where one is sort of obscured.

It's obvious to look at `add_new_payment` for the code that runs when adding a new payment, but then the code isn't there, so you have to know/ask or search in either migrations, a fresh structure dump or poke at the actual db (!).

I think they're great for other, well, effects when needed. PostgreSQL is a real powerhouse.

throwaway173738 3 years ago | |

Yeah I was expecting the post author to discuss the trade-off being made here because it’s really important to do so. The biggest complaint I have with these pithy articles is that they try to sell you on a particular trade-off without explaining what the deal is. It makes me think the author:

1. Just discovered this. 2. Implemented a bunch of them. 3. Hasn’t been maintaining this solution for more than a couple of months.

This is a trade-off that only manifests itself after maybe a couple years when you’ve built a system and you have many hands maintaining it. Either there’s some performance problem that makes it worth it or you’ve just obfuscated half your code for no reason. In the latter case those many hands are going to wreak havoc on your system before you figure out how to make it maintainable.

quadrifoliate 3 years ago | | |

> The biggest complaint I have with these pithy articles is that they try to sell you on a particular trade-off without explaining what the deal is.

This is basically the clickbait in the wider world affecting software development, even though it might not look like it. Boiled down to the essentials, we are telling each other the software version of "Here's How to Lose 10 Pounds in Time for Summer." way more often than "How To Balance Diet and Exercise to Remain Healthy Over A Decade".

But it's up to us to those of us who like more discussion to reverse these trends. In that vein -- do you have good sources that typically talk about tradeoffs with technology rather than promoting a specific one? Kleppmann's book on Data Intensive applications is one I have found in the past.

cgh 3 years ago | | |

I'm not sure if the post was updated after your comment, but from the article:

"Triggers should be used with caution since they can obscure critical logic and create an illusion of automatic processes. While this can be advantageous in certain scenarios, it can also pose a challenge in terms of debugging, testing, and monitoring since they are not readily visible to developers."

eastbound 3 years ago | | |

DB upgrades are a pain. Triggers must be deployed with “UPDATE” instructions, it’s not as easy as keeping code in git.

simonw 3 years ago | |

I think this problem can be robustly solved if you have the right mechanisms in place:

1. Migrations. Your schema needs to live in version control, and changes to your schema must be applied by an automated system. Django migrations are the gold standard here in my opinion, but you can stitch together a custom system if you need to, one that tracks which migrations have been run already and provides a mechanism to apply new ones.

2. Automated tests. Your triggers MUST be covered by automated tests - call them unit tests or integration tests depending on your preferences, but they need to be exercised. Don't fall into the trap of mocking your database in your tests!

With these two measures in place I'm 100% comfortable using triggers that split my business logic between my application code and my database code.

OJFord 3 years ago | | |

I don't think migrations (at least as done by Django et al.) solve it - you want a declarative source of truth for what the schema looks like today, not a chain of changes that only tell you that after computing the combined effect.

Even if they just created a generated file of the final schema, that sat in version control and errored the makemigrations check (just like a missing migration) if it was out of sync, that would be a significant improvement IMO. But I think the Django maintainers at least would say they want the ORM DSL to be that. (But it's way too incomplete, you'd be limited to a tiny subset of postgres, and even then you have to be on board with that being a reasonable description of your schema, not wanting the actual db schema anywhere.)

javajosh 3 years ago | | |

> Django migrations are the gold standard here in my opinion

Intriguing. In JavaWorld it's really only Flyway and Liquibase, and the important self-imposed rule, "Thou shalt never remove a column from a table."

ysavir 3 years ago | | |

But what problem is moving to triggers solving? Is keeping all your logic in a single place actually a problem, or is moving part of it to Postres tempting because it's nifty? How much training will you have to provide for new hires to be able to run with Postgres triggers instead of just Django logic?

AmericanChopper 3 years ago | |

Table triggers are the ultimate foot gun in this respect. They are highly obscured! Stored procedures split your business logic too, but when you want to go and look at your database logic, it’s at least where you’re expecting to find it, and not built into a table.

I’d highly recommend people avoid them, unless you feel that you really need them _and_ you have very robust development processes. As soon as you deploy your first table trigger, from that moment you have to check every DML statement for unintended side effects.

marcosdumay 3 years ago | | |

> from that moment you have to check every DML statement

If that's the case, you have a documentation problem. It should be easy to decide if there are side effects or not just by looking at the DML and the metadata you need for it.

In fact, the case here is that the trigger on the article is an incredibly bad one. It's not a natural consequence of the table, or the database structure. It's probably not even always true to the business rules, what is the one property triggers must have no matter what.

whstl 3 years ago | |

Business rules are already often in multiple separated places in a codebase. If you mean "it should be in the same git repository", then triggers should be already in migrations.

But IMO, even migrations are a stopgap solution to this problem. In some cases, such as in ORMs like Rails' ActiveRecord, even the column names aren't present in the model by default, they are only migration files.

The real solution is putting the definition of the triggers (and all other database things too) closer to the code that operates on them. In models, for example. And models should enforce that only the defined triggers are present in the database. Of course, since most ORM users aren't really big DB users, this isn't exactly common. Until we get features like this, I agree that triggers can be confusing.

cgh 3 years ago | | |

If you are using some JPA implementation, eg Hibernate, then JPA events are more or less what you are talking about except JPA can't enforce the condition that JPA-defined events are the only triggers present. If you write custom triggers then you run the risk of stale entity data being presented to the user.

But otherwise, this fulfills the condition of triggered events residing in the codebase. Of course, this means you have to be comfortable with using an ORM in the first place.

brightball 3 years ago | |

You can say similar things about any API based development though. Triggers are just a part of the database API.

Databases exist to manage data, not just store and retrieve it.

avereveard 3 years ago | |

it is also a very restrictive environment, you either only trigger on a single in transaction table, or you risk having triggers tripping other triggers, limiting the approach scale anyway. at which point a well managed transaction from an active record or a data gateway will do miles better.

citrin_ru 3 years ago | |

You have exactly the same problem in microservice architecture while calling a remote endpoint.

Aqueous 3 years ago |

Why is this the top story? This is a major foot gun. Don’t write business logic in the database. You may think you are simplifying things but in fact you are making them more complex.

Instead adopt a solution for structuring your business logic in a sane way, such as using a workflow engine. Your code will become simpler and well organized that way without creating a tangled web of distributed rules, as well as exist all in one place.

masklinn 3 years ago | |

> Don’t write business logic in the database. You may think you are simplifying things but in fact you are making them more complex.

Alternatively, write all the business logic in the database. This way you can better leverage the DB features and ensure that logic only needs to be written once.

Scubabear68 3 years ago | | |

I have worked with systems with nearly 100k lines code in Oracle stored procs, and another legendary place with around 4 million.

It was a nightmare. Deployments were very difficult, there was little tooling, reasoning about the system was difficult, and of course running so much code in Oracle required very expensive licenses.

And it is much harder to hire hard core PL/SQL devs over Java, C#, Python or whatever.

You really don’t want significant code in the DB. It can be useful for some cases, like automating audit tables, but that is about it.

And for triggers - I feel for anyone maintaining business logic located in triggers. What a debugging hell that can be.

ysavir 3 years ago | | |

I worked for a short while at a place that tried following a similar dogma. Hiring was incredibly difficult, as was retaining people (such as myself). Writing business logic in code instead of DB functions is much more approachable than keeping it in the DB.

majkinetor 3 years ago | | |

Using SQL for business logic is probably worst then using Cobol in 2023 - good luck with automated tests, debug, documentation etc. Yes, I know it all technically can be done but you relly need to be masohistic and mindless bastard in order to do so.

Rather, lets rejoice with latest C#, zig, rust, PoweShell or whatever-beautiful-language-and-ecosystem-perfected-in-latest-decade we have, instead of horrible SQL.

airocker 3 years ago | |

You don’t have to write business logic in the db. You essentially listen for notifications from workers who implement the business logic depending on the db changes. It is essentially like a poor man’s message queue for us.

chucke 3 years ago | |

Workflow engines are neither simple, nor they are widely available for all programming languages.

SaltyBackendGuy 3 years ago |

We use PG Notify at work extensively and it's the source of a lot of pain and suffering. Not because of the functionality itself, more so because what we did to ourselves by using it in the way we did.

I think this could be great for certain projects, but there is a lot room to put yourself into a situation that's hard to maintain if you/you're team doesn't possess the right amount of discipline around documentation, developer tooling, observability etc..

avinassh 3 years ago | |

> more so because what we did to ourselves by using it in the way we did.

what was the cause of pain?

asim 3 years ago |

What's old is new again. In 2007 we were using triggers and stored procedures heavily with mysql and a java app. Unfortunately we were also reliant on read replicas. Some of this replication behaviour did not translate well with the mix of these functions and auto incrementing IDs. Sometimes it would result in foreign key constraint violations and all of a sudden our replication would stop. This was even worse when we tried multi master setups. I spent years dealing with this. Ultimately we dropped the use of the most complex queries and shifted them into code which made the replication more stable but at the cost of Dev time.

Morale of the story, use it with caution. I know postgres is different but when you start turning you database into a ball of mud things get dangerously difficult to debug and fix.

SoftTalker 3 years ago |

90% of the time this kind of thing should be done in stored procedures, not triggers. You know when you are calling a stored procedure; you cannot do it accidentally. Triggers can cause things to happen "by magic" if you aren't keenly aware that they are there. They also complicate large updates.

Triggers to do something that's simple and always required, e.g. updating a primary key index for a new row (before autoincrement was available) can be OK, but use them sparingly.

I like putting business logic in the database, because you only write it once and not for each client application. Client applications and platforms and their development languages come and go a lot more frequently than databases. But I use stored procedures almost always, and rarely triggers.

chank 3 years ago |

I feel like we're coming full circle 30+ years and have to re-learn the perils of db triggers. When to use them and when to not. These days most of the db has been abstracted away from developers using ORMs so it must seem like the discovery of something new when in fact we already know.

Don't write business logic in the datastore.

charles_f 3 years ago |

> Triggers should be used with caution since they can obscure critical logic and create an illusion of automatic processes

Summarizes why in my opinion using triggers is rather risky and confusing. You introduce side effects to operations that one might suspect are CRUDlike. Your code is made non-atomic, in that you need knowledge of what happens elsewhere to guess why it's behaving a certain way. On small projects it's rather tempting, but small projects become large projects that then get given to someone else to maintain, and 4y later someone will spend a week trying to understand why the amount column gets updated to another value that they're pushing.

The only use that I find safe is for database metadata, say if you're using triggers to keep track of write origins, or schema metadata. For everything that's business logic, I'd stay away from them

thinkingkong 3 years ago |

I love these types of techniques. Need a basic no nonsense queue? Postgres. Need a basic reporting infrastructure? Postgres. Need a document store? Postgres.

But every single time this comes up, people on the engineering teams Ive been on all throw their hands up and accuse folks of overengineering or underengineering. You need rabbit or kafka. We should move to mongo. Etc.

Thats the part thats hard.

pgthrowaway3 3 years ago |

'Ate Mongo

Luv Postgres

Simple as

A quick note for anyone thinking about triggers: if there's any case whatsoever that you're going to have more than one row insert into a table per transaction, please use statement level triggers -- especially if you're doing network calls inside the trigger. Triggers execute within the same transaction, they're synchronous, and will soak up resources.

Hell, if you're using network calls in your triggers... please don't. Use `LISTEN/NOTIFY` or queues to offload that to some other process (whether postgres itself or another process), so PG isn't left hanging around waiting for a network return.

brightball 3 years ago |

There’s so much value in database triggers when they’re done right. Anything where live stats are needed gets a whole lot easier with triggers rather than counts.

trollied 3 years ago | |

Then you discover Materialised Views... :) Magic.

switchbak 3 years ago | | |

Just wait until we get Incremental Materialized Views!

nsilvestri 3 years ago |

The article mentions it at the very bottom, but I almost never reach for triggers because they are obscure places to put application logic. On more than one occasion I've been burned by not realizing that the code in the backend did not represent the whole picture of business logic. It's more complexity, requiring more documentation, adding another point of failure that probably isn't necessary.

suchar 3 years ago |

One major disadvantage of triggers is the inability to do canary deployments and vastly increased complexity of rolling deployments. When SQL code lives within the application, we can trivially run multiple variants of such code simultaneously. Running alternate version of a trigger for e.g. 10% of traffic is way harder.

What I would recommend instead is making use of CTE (Common Table Expression), because DML (modifying queries) inside `WITH` are allowed and taking leverage of `RETURNING` keyword in both `UPDATE` and `INSERT` we can execute multiple inter-dependent updates within single query.

With such approach we can trivially run multiple versions of an application in parallel (during deployment, for canary deployment etc.) and we have similar performance advantage of a single roundtrip to database. Additional advantage is the fact that there is only one statement which means that our query will see consistent database view (with very common read committed isolation level it is easy to introduce race conditions unless optimistic locking is used carefully).

tmarice 3 years ago |

I wouldn’t call this “simplified”.

Personally, it’s much more valuable to have all business logic in one place, in a single language, available at a glance. The perforance gain isn’t worth the increased complexity in codebase.

mirekrusin 3 years ago | |

You just unintentionally made your problem distributed which is can of worms and you'll find out later when your project is successful in production.

jaxr 3 years ago |

Wasn't this something that Oracle pushed aggressively like in the 80s or 90s and then everyone agreed it was a maintainability living hell? Is this a thing again for some reason I'm missing?

Clubber 3 years ago | |

It's good for vendor lock-in.

spprashant 3 years ago |

Triggers and stored procedures should only handle logic which you do not expect to change. They are notoriously hard to test and debug.

They are coupled tightly with database schema design. Making changes to them needs careful consideration and a extensive testbed environment which cannot be mocked with a fraction of the data.

Southland 3 years ago |

I worked at a company which relied on significant use of Postgres triggers and it was not simplified in my mind due to:

- Engineers being more comfortable expressing the required business logic in the other languages they were working in then PL/pgSQL

- Challenging to write tests for the triggers

- Harder to deploy variations for testing if needed

ForHackernews 3 years ago | |

You can write tests using pgTAP https://pgtap.org/

mberning 3 years ago |

If you plan on never changing away from postgres, never having to shard, never needing to do anything that a trigger can’t support, then it is a good option. Which may be true for the vast majority of the apps. You also need sql expertise in addition to app dev expertise.

kayo_20211030 3 years ago |

All true, but still a bad idea. Splitting your concerns across multiple systems will eventually bite you. It's too hard to reason about, and unless you want to make some very hard yards, there's not even sensible source control.

idlephysicist 3 years ago |

A colleague of mine talks about the Law of Conservation of Complexity. It boils down to "the complexity will have to go somewhere".

You can make the development of your backend more simple, by shoving the complexity into the database, meaning your backend just does less. That in itself does not make your application any simpler.

xept 3 years ago |

For Django there's https://github.com/Opus10/django-pgtrigger that makes it possible to define triggers right in your models, so you have everything in one place.

sabzetro 3 years ago |

Business logic in the database screams anti-pattern to me.

How do we know who created the rule, edited the rule? How can we reason about the sequence in which these rules are executed based on larger use cases with complex interactions.

Seems like a fire waiting to happen.

bob1029 3 years ago | |

> How do we know who created the rule, edited the rule? How can we reason about the sequence in which these rules are executed based on larger use cases with complex interactions.

You are presumably operating inside of a database, a place where the above concerns can be tracked in ~3 additional columns. More complex rule arrangements can be addressed with additional tables & relations. If you are starting from a blank schema, everything is possible. As noted by others here, you either go all-in, or all-out. The middle ground where half the logic is in the database and half is in GitHub is where things get yucky.

Consider the simplification angle. There are some techniques that allow for running entire apps directly out of the database. You might not even need Node, .NET, Go, Rust, etc. Hypothetically, if 100% of the things are in a database, you can simply record the binary log to S3 and have a perfect log of everything over time. Imagine how easy it would be to set up a snapshot of a given environment on a developer machine. Inversely, you could directly ship a developer's machine to production. You can also do some crazy shit where you merge bin logs from different timelines via marker transactions.

The other major advantage includes being able to update production while its live, even if production is running on a single box. I saw a particular PL/SQL install earlier in my career that was utilized for this exact property - production literally could not ever drop a single transaction or stop servicing them. Latency beyond 2 seconds could be catastrophic for system stability. Production could come down, but it had to be all or nothing. Think - shutting down a nuclear reactor and the amount of time you are locked out due to the subsequent safety and restart checklists. You absolutely need a way to safely & deterministically update in-between live, serialized transactions or you can't run your business effectively.

SinParadise 3 years ago | |

I'd say depends on the complexity of the logic itself. I would never write triggers with any logical branching, but for simple update table B when table A is updated? I definitely see the value in that.

runeks 3 years ago |

I'm torn on this subject. It's not a simplification in my view, but just one way to achieve a goal that has pros and cons.

The big pro is that you no longer need to remember to update tableB, which is derived from data in tableA due to performance, in your application code every time you update tableA.

The cons are that:

* You add more state to your DB

* You can't express the logic in your backend language

Thinking more about it, I don't think the cons outweigh the pros. I would prefer this trigger logic to be part of the DBMS, so I can express the logic in my backend language and also avoid increasing the dependency of my application logic on DB state.

polishdude20 3 years ago |

One of the reasons we use database triggers is that we have a legacy system running on Rails and a new system in Typescript. The old system has an entity that is similar to the new systems entity but a bit different. While in this limbo of sunsetting the old system, we have triggers on the old entity when it changes to update the new entity. The thing is, these triggers invoke a lambda which does the business logic for migrating old row to new row. We could also have the old system maybe make an API call to the new system and skip triggers altogether.

phendrenad2 3 years ago | |

Using a trigger to kick off a lambda seems like the way to go. You're essentially doing what the NOTIFY command does.

tuyguntn 3 years ago |

It's easy to start with when your project is small, especially for quick fixes and improvements (e.g. total # of orders made by user), later it will become a mess and makes your project difficult to maintain, because you usually don't test your database in unit tests and database migrations are still a thing (from one db to another, from one type of columns to another and so on)

spprashant 3 years ago |

> You should consider the fact that complexity will still be there, but it will be abstracted away inside the database.

You have got to be kidding me.

tonfreed 3 years ago |

I used them once when I was a young engineer, then quickly realised how much I hated them because of how much they obscured the logic of my application. I hadn't even started to worry about migrations at that point.

Been first in all my teams since then to loudly voice my opposition when someone suggests it as a quick fix for something more complicated

wgerard 3 years ago |

Heh I vaguely recall at Etsy, predating my time, that a significant amount of business logic was done using stored procedures and triggers.

They migrated away from it at some point, but some of the people who handled that migration were still around when I was there. Didn’t sound fun at all, sounded like a horrific nightmare.

airocker 3 years ago |

We use triggers and notifications extensively.!it is great because we don’t have to run a message queue . the only concerns are that notification has a size limit that is quite small. Also, it is harder to implement multiple workers who get only some of the notifications to load balance.

ukd1 3 years ago |

Using triggers when you have a single-codebase is prone to obscuring where things happen - is it in code, or in a trigger somewhere? However, when you have multiple different codebases touching the same db, it can be great at enforcing things to happen in the same way across these.

rr808 3 years ago |

We have a trigger that is 1400 lines long. Try debugging that when there is a problem.

_a_a_a_ 3 years ago | |

Sounds like your problem isn't the trigger per-se

tgv 3 years ago |

And then someone does something to the database that disables the triggers. Or just one of them. Or someone adds the same trigger a second time. It requires a lot of discipline to keep it sane.

jtokoph 3 years ago |

Not about the core of the article, but how often are folks using something other than the id column as primary key and having id be a foreign key? This seems extremely confusing to me.

jimnotgym 3 years ago |

>You should consider the fact that complexity will still be there, but it will be abstracted away inside the database.

Where Git isn't looking...

_a_a_a_ 3 years ago |

Or maybe just use stored procs, which may be better for the wallet example here.

harha_ 3 years ago |

I don't like the idea of moving application backend logic to the database.

revskill 3 years ago |

When you have one problem, you decided to use Postgres Trigger.

Now you have two problems.

signalioto 3 years ago |

That's just written to trigger me right?

Right?

hoki718 3 years ago |

Betbola138