Flyweight: An ORM for SQLite(github.com) |
Flyweight: An ORM for SQLite(github.com) |
const fights = await db.fights.get({ cardId: 9, titleFight: true });
translates to
select * from fights where cardId = 9 and titleFight = 1;
Confession: something about ORMs has never clicked with me.. none of them ever seem simpler than SQL.Where ORMs are useful is once you have your objects. The usefulness of an ORM is being able to say:
user.email = 'new@example.com'
user.groups.append('admin')
user.save()
Also being able to work on your data in objects or structure native to your programming language means that you can leverage both the strengths of the language and the database server. Some operations are much simpler to do in Python or C#, compared to SQL. In other scenarios you really need to let the database do its thing. Again, you do have a point, because believing that you can skip learning SQL and just rely on the ORM will get you into trouble.One other feature of ORMs is that they allow you to write code that will run on multiple databases, but at the cost of not being able to use the more advanced database features. For many CRUD applications that is a price worth paying though.
That example you give here doesn’t require an orm, though. In ColdFusion world they use(d to use?) a pattern called dao, data access object.
why not
user.updateGroups('new@example.com','admin')
Sure you have to write the updateGroups method and use SQL to do it, but that's trival. On the otherhand when you want to do something more complex this is when ORM's inject all sorts of subtle and dangerous bugs into your code base.
I've had some terrible experiences with them...
- during bulk inserts, having to generate a list of values. It'd be wonderful to be able to just supply a single `?`, or use some other symbol to note that it's a value list. Making the user generate a bunch of (?,?),(?,?)... is not at all friendly, and something everyone has to do. and the cherry on top is that there can't be a dangling comma at the end, so it's gotta be chopped off, or omitted.
Not at all a hard problem. It's an annoying problem that I don't understand why it hasn't been solved at the prepared statement level.
- things like database, table, or column identifiers that may be variable based on application context
Basically anywhere that currently winds up getting interpolated should have a way to be parameterized.
without those two, i think it's inevitable to arrive at one of:
- an orm
- a sql query template renderer
- a bunch of functions to do very specific string interpolations
It's much cleaner than generating SQL, and doesn't run into issues with exceeding the maximum number of parameters.
SQL is the usual mode for receiving those relations, and so many ORM toolkits also include query builders to help with that level of abstraction, but theoretically an ORM toolkit could require you to write SQL in the raw.
For instance, you could have a SQL statement like getCardsForFight: select * from fights where cardId = ? and titleFight = ?, and it would generate a class that has a method getCardsForFight(cardId: number, titleFight: number).
With something like this existed for Golang!
I agree that often the full ORM isn't very helpful, I prefer something that's more of a light layer over SQL to enable dynamic query building and that's it.
They are very nice to have, and I'll implement one the first chance I get, but "indispensable" is a stretch, IMO
If you work on a large application with lots of transactional processing, it makes sense. I've worked on apps with 10's of 1000's of lines of Ibatis scripts. And on applications where you're expected to just invoke stored procedures for every little operation. I'll take Hibernate over that situation any day.
People really overstate the pitfalls of ORM's. If you're running reports against a data warehouse, don't use an ORM. If you are trying to optimize a complex query, don't use an ORM. For an app that is write heavy, with very many simple updates to a complicated object graph, it makes sense.
I'm with you on that. I use ORM's, yet, being perfectly comfortable with SQL my perspective ends-up in a range between "this complexity isn't worth it" and "why not just write this in SQL?".
One of the arguments for ORM's is being able to move to different db engines. Frankly, I can't remember the last time I had to do that for a mature/released application.
Once again, I'll admit my perspective is biased because SQL isn't a problem for me. When I look at ORM code it looks and feels very detached from the database. I look at SQL and everything is clean and clear. In addition to that, you don't have to create and manage a bunch of objects that take-up memory and slow things down. Adding layers of abstraction isn't always the best idea.
I tend to agree, I'm more partial towards Micro-ORMs, ideally with a 'natural' DSL; something that is a good blend of 'SQL'-ness and 'target_language'-ness.
I can think of SlickDb (Scala), Linq2Db (C#), Ecto (Elixir) was good if definitely 'elixir-y'... IDK what other languages have such magical things; I know all of the above lean heavily on certain language features to provide their magic.
In the case of SlickDb, while I don't write Scala I read a lot of it, and could always understand what it was doing. In the case of Linq2Db, it just plain has a DSL that is 'close enough' to SQL. I love it and have saved multiple teams from EF with it. Ecto, as mentioned above... it wasn't -bad- but definitely walked the line of 'just write SQL' to me. (I'll admit however I am at best a good elixir app -maintainer-...)
In general,
- 'object tracking' is bad; I have yet to see a good use case where it doesn't wind up masking scope of modifications in the long run.
- 'overly native' DSLs are usually bad; the closer a DSL is to the 'metal' of SQL, the less likely you are to be surprised by the generated SQL (e.x. the 'surprises' of linq2Db tend to be far fewer than in, say, Entity Framework)
- If your language doesn't have a good DSL-providing MicroORM... try to find a basic MicroORM that just takes SQL, and write tooling around it.
for p in Person.objects.raw('SELECT * FROM myapp_person'):
print(p)You are not alone on that sentiment. The opaque nature of the translation often result in poor performance, and to me the apparent ergonomics gains are not worth the trouble either (at least in most cases).
Nowadays we still use libraries to abstract over SQL dialects and generate SQL in a typesafe and convenient way, but it's not an ORM in the sense that it maps from the object oriented domain into the relational one and back.
Your example is a bit disingenuous. A SQL query isn’t native in most programming languages, so you’re missing a lot more boilerplate code
I've yet to encouter a project where I'd need to switch to a different database. Even if that happens, there is likely some raw SQL that someone wrote because the ORM didn't do something as expected. Or some part of the code uses DB-specific ORM features that don't map to different databases. The only thing I can imagine where this would be useful is when you don't have control about what DB is being used, for example, when building a product that should be compatible with Postgres and MariaDB (and each is getting used). However, in the age of containerization, this isn't a big problem any more.
In some ORMs, I need to create types that the result of a query containing JOINs is mapped to. Others don't support them _at all_. In TypeORM, there is a query builder which forces you to put in _some_ SQL for things like "WHERE a in (b, c)". Most ORMs I've used have a cumbersome handling of relations, for example when I need to specify which relation should be fetched eagerly.
I created a proof of concept of a different approach: Just embrace SQL and provide static typing based on the query. The return type of a query is whatever that thing is that the query returns in the context of the database schema. It's possible to do in TypeScript, by parsing the SQL query at development time:
https://github.com/nikeee/sequelts
One benefit is that it does not need any runtime code, as it's just a type layer over SQL. You don't have to rely on some type-metadata that TypeScript emits. That's why it also works with JavaScript only. You don't have to fit every result into some type - it just returns an interface that can be used wherever you want. That's especially useful because TS's type system is structural.
One major downside is that it's rather complicate to implement a parser and evaluation of the result type in TypeScript's type annotations. A different story is debugging type-level code; it's basically try-and-error. Providing error messages in case a query is wrong is also something that needs work. That's why it's only a PoC.
That is the point of Flyweight. The API is very small, and for everything else you use SQL. The ORM parses the SQL to figure out the types, and when you want it to, it maps the SQL into more complex data structures.
I think it's one reason that I'm leaving at the end of next week.
ORMs (or query-builders, as some like to draw a distinction there) become more valuable as you use them to compose and transform queries. SQL is decidedly not composable.
My experience with ORMs is that they’re a minefield of performance cliffs. It’s easy to accidentally generate suboptimal SQL or introduce additional round trips unless you’re very careful about the code you write, at which point you might as well write SQL directly and be at the mercy of one less black box.
This is expanded when migrations are brought into the project.
We can train the data model to have a single management point, that being the file where the ORM classes are defined.
one thing is trying to present an internal dsl for context dependent data access, the other is a generic text api to represent queries and commands.
the analogy would be comparing aws java sdk, and commenting that HTTP rest apis are so simple why would anyone use the sdk?
However, I understand the problem it tries to solve: Object-Relational Impedance Mismatch. I only wish there were other idioms, not the convolution of classes and mappers.
I agree that ORMs are dangerous and clumsy for a number of use cases, but the query-building aspect is indispensable in many scenarios. Unfortunately, the query builder is usually tightly coupled to the ORM, but _if you do know SQL_ then you can use the ORM sensibly and performantly.
But, there was one that I played with that did have appeal to me, RedBeanPHP.
Forgetting that it's PHP for a minute...that's not the main point. It was cool because it had a fluid way of working. It automatically generates the database, tables and columns... on-the-fly, and infers table relations based on naming conventions and how you interact with code. No config files at all.
So, you would iterate in dev solely by writing code, and end up with a schema including foreign relationships. Then, you can "freeze" the schema for prod, turning off all the dynamic stuff.
Their quick tour explains it well: https://redbeanphp.com/index.php?p=/quick_tour
Note: I'm sure it has notable downsides over time, but the approach was really nice starting from scratch.
My experience with ORMs is very similar to my experience with web frameworks. I view them both as a way to offload cognitive burden while you learn about other aspects of the problem space. Once you reach mastery in those other areas, you can begin to dispense with the frameworks and resume more ownership over these areas.
Surrendering a little bit of control up-front makes a lot of sense when you are trying to work through a difficult & new problem. I would definitely prefer the computer do some sub-optimal, blind-mapping of my objects until I could settle on a final schema. Managing a bunch of raw SQL queries while your relational model is still in flux is not something I would look forward to.
The problem with an ORM is that it's a high level abstraction On top of what is ALREADY a high level abstraction: SQL. ANd it's not even a one to one abstraction... they are very different and this actually adds more complexity when it comes to optimizing SQL.
You optimize SQL with hacks to get it to compile into an efficient query. With an ORM you have to hack the ORM in order to hack the sql in order for it to compile into an efficient query. It's nuts.
The fix for this problem is to not use an ORM. You want to use in language primitives? Make an abstraction that is one to one with SQL. A library that gives primitives that directly represent SQL language primitives. That's what we actually all want. We don't actually want an orm.
Which SQL?
When the types are coming from an external source that may or may not be available at compile time, I can't think of any way to prevent codegen and also retain type-safety. Some additional integration with build system will be needed.
The other tools I have seen that try to figure out SQL types do not accurately create types for things like left joins.
I don't really care about the query, just that it's inserted correctly, along with other operations that an ORM provide (get/delete/update,etc)
As for the ORM debate, not applicable for SQLite but if m using a database with better support for stored procedures (like sql server, postgresql or oracle), i just prefer a minimal DAO or an ORM that just built from stored procedure calls.
Unfortunately sometimes, specially in a big diverse team we do prefer an ORM so devs focus on other things a let the DBA guys try to guess why my code calls select everytime it refers to "entries" object just to get one entry.
https://github.com/Aperocky/sqlitedao
https://pypi.org/project/sqlitedao
Same concept, huge speed boost to personal projects. ORM is great because you can abstract items in memory directly into persistence, and define the relation in programming language instead of SQL.
I generally reach for TypeORM and have tried MikroORM lately but didn’t really like it.
But what I really want is something like slonik which is more focused on querying than relational mapping.
[1] ts-sql-query.readthedocs.io/
It uses Knex as the query builder so maybe you can just use that directly: https://knexjs.org/
Although, I would use Mikro still to manage schema and migrations. Then drop down to Knex: https://mikro-orm.io/docs/query-builder#using-knexjs
[1] https://blog.codinghorror.com/object-relational-mapping-is-t...
Simple ORMs that map columns in rows to attributes or properties in an object are fine. ORMs that handle complex relationships and migrations and the rest (a la Entity Framework, Hibernate, ActiveRecord), are all pretty much a vote of no confidence from me in any project.
include: ['posts']
in SQL, you will write join posts on p.authorId = a.id
I would argue an ORM is more about getting the flat structure of the result set into an hierarchical set of objects with more complex types.Jokes aside, as someone who wrote a python sqlite ORM (shameless plug: `pip install sqlitedao`), my reason was to have a minimal ORM for personal project, the entire active source is contained in one file and it works for majority of the use cases (i.e. insert_item, get_item, etc).
Also tests: https://gist.github.com/johndoe46/17eacf0f12772dfb870732479b...
(This is proof of concept quality, don't use for real work)
But .NET also has Dapper where it lets you write all the SQL and then it just handles the binding of data into objects, which having that handled for me is great.
I mention because I had something of the opposite experience with it. It not only ended up yielding the correct queries, but I saw a significant increase in performance. And the neat thing about it, beyond ORM and linq-to-sql, is a common interface amongst providers - so you can do things like swap from SQLite to Postgres with 1 line* of code, so long as you're not using provider specific extensions.
But, yeah, that's not the normal path.
EDIT:... well... I see it's still got an aim of being PHP5 compatible. Upgrades for PHP 8.1, but that seems off a bit. I seem to remember composer was an issue... and there still is. Impressive that the author is still evolving and supporting it :)
Either way the answer is this:
All the popular versions of SQL. I mean what else could the answer be? Your question was rhetorical. Trying to expose a flaw.
See the sqlalchemy expression language in sqlalchemy core. It's already very similar to what I'm proposing. It exposes something that is one to one to ansi sql. Then the underlying implementation translates that expression language into different sql languages. This is the key: Underlying implementations of ORMs typically ALREADY handle all versions of SQL languages.
What I'm proposing is a realistic extension of that. Separate APIs for EVERY sql language. You already have separate implementations... thus separate APIs is not such a crazy unrealistic extension of that.
All sql languages have common language primitives, thus an API could have two sets of libraries. One for common primitives and the other for language specific primitives. But that common library can end up being a rabbit hole, so if it's possible to do cleanly... sure, but if not then just seperate APIs for every sql language is fine.
I mean does code really need to be portable across sql databases? I wrote ORM code for postgresql, do I suddenly need that code working with SQLlite? It might be a convenience, but it's not a huge requirement.
Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', ["Doe"])
Should you want to work without your models... well, you still have access to your connection object and can happily fetch whatever column or aggregate you want. insert into your_table(id, created_at, uri, project_id)
select id
, created_at
, endpoint as uri
, project_id
from jsonb_to_recordset($1)
as x(
id uuid
, created_at timestamptz
, endpoint text
, project_id uuid
)I'd rather simply understand how my ORM generates queries, and then use the ORM to get the full power of my language of choice in creating the right SQL queries.
Depending on the ORM this may be be far from trivial. Despite being a C# dev rather than a DBA I can say that I have a better understanding at least one DBMS's query planner than what Entity Framework will generate for certain queries [0].
I'll note that -good- MicroOrms can make 'subqueries' in a fairly composable way.
I think with a manual SQL, you can still make a 'good' mini-DSL with subqueries, but it will take some thought to do well. Actually, with -extremely- thoughtful design it can be far more performant and productive, but IMO the complexity would need to show ROI to be worth it.
But in -either- case, in the long run the restrictions help with perf and maintenance. Whether it's hand-written mini-DSL or a good MicroORM DSL, you have a much better mechanical sympathy for what you're doing to the DB, and it becomes easier to write Table designs that are performant.
> and then use the ORM to get the full power of my language of choice in creating the right SQL queries.
Now you have to learn an ORM, a Query planner, -and- how to make them play nice.
[0] - There has been more than one shop where I was the 'EF Expert'... take that as you will.
For one, you're creating a hard coupling to a specific flavor of SQL. And that's not too mention the fact that you're taking an otherwise purely data object and embedding persistence logic into it - a horrifying abuse of OO.
You're just trading one coupling (specific flavor of SQL) to another (your ORM.)
Assuming your application is layered correctly, when you write your own queries, all of your SQL queries are in a single place and can be updated.
BUT: If you're using an ORM, and you let your data bound objects leak into all layers, the coupling is much much much harder to fix if you chose to change your ORM. IE, if you do things like lazy loading, or construct your queries in business logic, switching ORMs will be extremely painful.
I've done it both ways (write my own SQL and use an ORM) and I would say the single biggest mistake is to assume that you absolutely should (or shouldn't) use an ORM.
Adding persistence logic to a data object adds all kinds of bloat - it has to have a connection to the database, which now makes unit testing a pain in the neck and introduces all kinds of weirdness around serialization. Now it has a bunch of CRUD methods, so developers have to know which methods are for business logic versus persistence. Also, data objects with persistence logic aren't really suited to be published in a shared library as clients should not access your database directly.
All around, it's just a really terrible idea.
I fail to see why this is a problem. Switching databases is a costly move, and is pretty rare as far as I know. When it does happen, it is usually from one type of db to another type, not between two RDBMSs.
IMO it doesn't, by itself, justify sticking to an ORM rather than raw SQL.
That's not what you do if you write your own queries. That's what a novice does after learning object oriented programming.
For a more accurate example, look at things like Hibernate / NHibernate / Entity Framework and lazy loading. They inherit from classes at runtime and will transparently run queries as business logic navigates relationships on an object graph. It's "not wrong," but it can lead to all kinds of problems and painful refactors. (But it's totally "worth it" in prototypes, one-offs, and throwaway code.)
If you don't do lazy loading, Entity Framework still requires that you pre-declare which relationships you will traverse. It's intended that your business logic uses Entity Framework APIs to say what part of the object graph it will use. (Thus tightly coupling your application to your ORM, which means your trading tightly coupling to a database to tightly coupling to a framework / library.)
But, keep in mind that the ORMs I mentioned tie the objects to the database connection. It isn't quite as intense as "Adding persistence logic to a data object", but they do track that, if you modify an object, it can only be saved on the transaction that it was loaded from; or if you're using lazy loading, lazy loading only works if the transaction is still open.
If you build a layer around your data access code that fully shields your business logic from your ORM, you've "done it right." But, at that point your ORM's value becomes negligible, because from the perspective of your business logic, it doesn't matter if your data access layer has hand-optimized SQL or an ORM; but you've lost one of the real selling points of an ORM, which is that you can easily do your data access from within your business logic. Which is why I say that the biggest mistake is assuming you should, or shouldn't, use an ORM.
Second, you just tried to argue for manually writing your own SQL rather than using an ORM by suggesting the way NHibernate (an ORM) does it is the right one. Might want to think up a new argument.
Finally, I used NHibernate in the past but it's gone out of favor because of the problematic things it does. Those proxies have to be runtime generated (startup cost + some extra overhead for each query) and are terrible for your data model. You have to mark properties virtual and suddenly GetType doesn't return the expected value. State tracking is actually not a great idea - causes many people to do dumb things like querying just to perform an update.
Entity Framework with state tracking turned off is better all around.
I don’t disagree with the overall point you’re making, I’m just pointing out that it’s very common
Even if they did the same thing you're suggesting, by that logic, we should all use PHP because WordPress does, and it's probably more popular than those two combined.
Take a look at something like Spring or Entity Framework to see other takes on ORM patterns.
And again, you're trying to argue against ORMs, while citing that your suggested alternative is supported by the way ORMs do it.
I don’t see what the nuance is you’re getting at with the mixin vs object method argument though. They both have the same issues around testability and violation of SRP.