SQL:2023 has been released (iso.org)

270 points by MarkusWinand 3 years ago | 133 comments

MarkusWinand 3 years ago |

The major news is:

- SQL/PGQ - A Graph Query Language

- JSON improvements (a JSON type, simplified notations)

Peter Eisentraut gives a nice overview here: https://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-fin...

dang 3 years ago | |

Discussed here:

SQL:2023 is finished: Here is what's new - https://news.ycombinator.com/item?id=35562430 - April 2023 (153 comments)

bokchoi 3 years ago | |

Thanks! Lots of little neat improvements in there like accessing JSON values using dots and array syntax:

    SELECT t.j.foo.bar[2], ... FROM tbl t ...
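For comparison, here is a minimal sketch of the same lookup in the older path-string style, using Python's bundled sqlite3 module. The table `tbl`, the column `j`, and the sample document are made up for illustration, and the sketch assumes the SQLite build includes the JSON functions. `json_extract` is only a stand-in here; the SQL:2023 dot notation itself is not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (j TEXT)")
conn.execute("INSERT INTO tbl VALUES (?)", ('{"foo": {"bar": [10, 20, 30]}}',))

# SQL:2023 simplified accessor (shown for comparison, not run here):
#   SELECT t.j.foo.bar[2] FROM tbl t
# Path-string form that SQLite understands; array positions start at 0:
row = conn.execute("SELECT json_extract(t.j, '$.foo.bar[2]') FROM tbl t").fetchone()
value = row[0]
print(value)
```

The dot form says the same thing without the quoted path string, which is the "simplified notation" part of the JSON improvements.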

sverhagen 3 years ago | |

I'm a pretty average SQL user, but I've heard expert consultants say before that they could do many more things with SQL databases that developers like me would have maybe grabbed a different tool for, like a graph database. So this then makes me wonder, once there's even broader adoption through PGQ, is that going to be a killer for niche databases like Neo4j, in favor of, say, Postgres?

derefr 3 years ago | | |

Graph databases are about as different from RDBMSes storage-wise, as column-stores are from row-stores. It comes down to how you plan to shard data and distribute queries when data doesn't fit on a single node.

Using a graph DB with many underlying KV-store nodes, you can have a single graph spread over many machines representing e.g. Facebook's social graph, and run a query which "chases around" edges between vertices that live on different nodes, to solve that query, while ensuring that as little of that has to happen as possible — both by rebalancing vertices so that data is sharded at low-connection-degree points in the graph; and by consolidating the steps of queries that occur on the same node into single batch queries, such that the whole thing becomes (close to) a single map/reduce step.

There's nothing in Postgres that knows how to do that; if you had e.g. a graph stored in a Citus distributed table, and did a recursive CTE over it to do graph search, then you'd get pretty dang bad perf.
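To make the recursive CTE approach concrete, here is a rough sketch of a reachability query over an edge table. It runs on single-node SQLite through Python's sqlite3 module, so it only shows the shape of the query, not the distributed cost; the `edges` table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")],
)

# Each round of the recursive term is a join against edges: one more hop.
# UNION (not UNION ALL) drops vertices already seen, so the cycle
# a -> b -> c -> a terminates.
reachable = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT 'a'
            UNION
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
        )
        SELECT node FROM reach ORDER BY node
        """
    )
]
print(reachable)
```

Every hop is another join round. If the edge rows live on different machines, each round can turn into cross-node traffic, which is the part a purpose-built graph store tries to minimize.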

quantified 3 years ago | | |

A big problem (IMHO) with graph databases is building "the" graph model, and the fact that it's easy to be faced with problems that don't suit a graph database. Something as simple as returning the distinct set of values for an attribute, with a count of the vertices containing each value, requires going outside the graph model, so it doesn't compose well in a property graph system. (There are other graphs besides property graphs; they will have their time someday.)

What you really want is to apply graph processing to data as it is. The SQL 2023 additions are a step in the right direction. I need to find a good detailed description of the constraints and semantics to assess how good it is.

pphysch 3 years ago | | |

"Kill" is a strong word, as Postgres's solid JSON support technically obsoleted MongoDB for most use cases, but Mongo is still around for various reasons.

I suspect if Postgres had a solid implementation of SQL/PGQ it would be a similar story for Neo4j.

czx4f4bd 3 years ago | | |

I wonder if there's been any observable correlation between JSON support in the major SQL databases and the decreased (or increased?) adoption of NoSQL document databases like MongoDB. It would be interesting to do some bulk analysis on GitHub commits to compare their use over time.

pphysch 3 years ago | |

See also [1] for how this (might) relate to PostgreSQL

In particular it is nice to see that a core dev views JSON dot accessing and PGQ as "sensible" future additions to Postgres.

[1] - https://peter.eisentraut.org/blog/2023/04/18/postgresql-and-...

wslh 3 years ago | |

Basic question: is it correct to assume that having PGQ involves a big change in the database engine?

pphysch 3 years ago | | |

AFAICT the idea is that you are not directly querying the tables as a graph, but you construct a graph "view" from the tables, and then query that graph using PGQ.
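A rough sketch of that idea, with plain Python and SQLite standing in for the real thing: the `graph` dict below is a hypothetical stand-in for a property graph definition, and `match_one_hop` is an invented helper, not PGQ syntax. It is only meant to show that a one-hop pattern can be answered by joining the existing tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (a INTEGER, b INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO knows VALUES (1, 2), (2, 3);
    """
)

# Hypothetical graph "view": it only names which table supplies vertices
# and which supplies edges. No data is copied.
graph = {"vertex": "person", "edge": "knows", "src": "a", "dst": "b"}


def match_one_hop(conn, g):
    """Answer the pattern (x)-[edge]->(y) by joining the underlying tables."""
    sql = (
        f"SELECT x.name, y.name FROM {g['edge']} e "
        f"JOIN {g['vertex']} x ON x.id = e.{g['src']} "
        f"JOIN {g['vertex']} y ON y.id = e.{g['dst']} "
        "ORDER BY x.name"
    )
    return conn.execute(sql).fetchall()


pairs = match_one_hop(conn, graph)
print(pairs)
```

In this sketch the graph layer is just metadata plus a rewrite into joins, so the storage underneath stays relational.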

zozbot234 3 years ago | | |

It's just a different language and a simple "property" layer over the existing data. No changes to the internals are necessary.

thanatos519 3 years ago | |

PGQ looks neat - create a "property graph" from a relational model, then query it via Cypher-like expressions. The best or the worst of both worlds, depending on implementation quality.

justinclift 3 years ago | |

As a thought, it might be better to use the https:// link to Peter's overview. :)

MarkusWinand 3 years ago | | |

Fixed. I wonder why Google sent me to http...

mariuz 3 years ago | |

And here is the article on the status of SQL:2023 support in PostgreSQL https://peter.eisentraut.org/blog/2023/04/18/postgresql-and-...

Zpalmtree 3 years ago | |

I like the NULLS DISTINCT / NULLS NOT DISTINCT option for unique constraints. I was wanting this feature just a few weeks ago.

MarkusWinand 3 years ago | | |

That particular one is already available in PostgreSQL 15.

https://modern-sql.com/caniuse/unique-nulls-not-distinct
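For anyone unsure what the option changes, here is a small sketch of the traditional default, where NULLs count as distinct from each other in a unique constraint. It uses SQLite through Python's sqlite3 module purely as a convenient engine; the `users` table is made up, and the new clause itself is not exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT UNIQUE)")

# Two NULLs are accepted: by default each NULL is distinct from every other.
conn.execute("INSERT INTO users VALUES (NULL)")
conn.execute("INSERT INTO users VALUES (NULL)")
null_rows = conn.execute(
    "SELECT count(*) FROM users WHERE email IS NULL"
).fetchone()[0]

# A repeated non-NULL value is rejected as usual.
conn.execute("INSERT INTO users VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO users VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(null_rows, duplicate_rejected)
```

With UNIQUE NULLS NOT DISTINCT, an engine that supports it would reject the second NULL insert instead.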

bionhoward 3 years ago | |

Which SQL DBs support these features now? Who is almost there? I’m definitely excited to try it!

ksec 3 years ago | |

I wonder when or even if MySQL will adopt any of these.

jchw 3 years ago |

One thing that has always agitated me about SQL is that although it's standardized, and the standard seems to encompass a shit-ton, in practice the world's most popular database engines don't really seem to have any meaningful interoperability.

For example, OK, I realize auto-incrementing IDs are not the most important thing in the world, and arguably not even a good approach in many cases. But sometimes you want them, and helpfully almost every database engine I know of has some kind of support for this, even if the semantics may differ. It's a super basic thing to want a unique ID that roughly counts upward on a table. You might have specific needs about re-using numbers and whatnot, but the general idea is very simple.

However: in practice, there is not an excellent way to do it that I can see. The closest thing I could find is `GENERATED BY DEFAULT AS IDENTITY` which, well, works. However, none of SQLite3, MSSQL, or MariaDB supports this, to my knowledge.

This is relentlessly annoying.

Is it the standard's fault, or the implementations'? I honestly can't say. However, I definitely find this annoying, since I was really hoping that by this time, we'd at least have a nice clean subset of standard SQL you could count on anywhere, for popular database engines. Unfortunately, it's not quite there yet, necessitating ugly hacks to this day.

I assume this new standard doesn't really change anything in this regard, since it's a desync with implementations that is the problem, and it does not seem the standards committee really cares too much about this kind of thing. (I could be wrong, though, as I am saying this based on feel and not evidence.)
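To illustrate the kind of hack this leads to, here is a minimal sketch of a per-engine lookup for "an integer key the database fills in". The `ID_COLUMN` table and the `create_table_sql` helper are invented for the example. Only the SQLite spelling is actually run; the other strings are the usual spellings as I remember them and are untested here.

```python
import sqlite3

# Per-engine spelling of an auto-assigned integer key.
# Assumption: only the "sqlite" entry is exercised below; the rest are
# listed for comparison and have not been run against those engines.
ID_COLUMN = {
    "standard": "id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY",
    "sqlite": "id INTEGER PRIMARY KEY",
    "mysql": "id INTEGER AUTO_INCREMENT PRIMARY KEY",
    "mssql": "id INTEGER IDENTITY(1,1) PRIMARY KEY",
}


def create_table_sql(dialect):
    """Build the CREATE TABLE statement for the chosen dialect."""
    return f"CREATE TABLE items ({ID_COLUMN[dialect]}, name TEXT)"


conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("sqlite"))
conn.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])
ids = [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]
print(ids)
```

The insert statement is identical everywhere; it is only the table definition that needs a dialect switch, which is exactly the annoying part.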

justinclift 3 years ago |

Ugh, it's CHF 208.00 (about US$230.00).

---

As @rgbgraph points out below, the price is actually several times that. There are several parts to the standard, and that US$230 is per part.

awestroke 3 years ago |

If only they could start allowing queries to begin with "FROM tbl". It would allow for much more helpful autocomplete. Also, DELETE or SELECT should really be on the very last line of the query. Seems like these changes could be done without losing backwards compat.

calvinmorrison 3 years ago | |

As in, if only SQL were actually writable or intuitive, you'd want to use it more. Instead I just reach for wrappers 99% of the time, where I can chain all the operations I want together and let Eloquent figure it out.

aerzen 3 years ago |

Where would one find a pirated mirror of this standard? Or the 2019 one?

Asking for a friend, of course.

jeppebemad 3 years ago | |

Or said in a more 2023-chatgpt-jailbreaky kind of way: what urls to avoid in order to not find pirated mirrors?

cpdean 3 years ago |

I like SQL and all but I really don't care to follow ISO releases. They're hundreds of dollars and nobody actually implements the whole thing. I get way more excited about database releases.

Does anyone else find value in what's in an ISO standard?

lolinder 3 years ago | |

> They're hundreds of dollars

This isn't SQL-specific, but this is 100% the problem for me. There's such a big culture gap between the way that we do things in most of the tech world and ISO, and one of the biggest clashes is this weird $180 PDF thing.

If I want to implement a new standards-compliant HTML parser, I can hop right onto whatwg.org and view the complete standard instantly [0]. It's massive and complicated, but it's freely accessible to anyone interested.

In contrast, if I want to implement an ISO 8601-compliant date parser, ISO wants me to buy their PDF for CHF166 (~$180 USD). This spec is for a standard that is orders of magnitude less complex, and they're charging through the nose for it.

I'm unclear what makes the difference between a standard that can be maintained by a community for the benefit of everyone and a standard that needs to be locked behind a paywall.

[0] https://html.spec.whatwg.org/

chillfox 3 years ago | | |

A pay-walled standard is not available, and an unavailable standard is not a standard at all.

The only real way of fixing it is for enough people to ignore ISO so they become irrelevant.

If you are building a new DB engine (toy or not), don't use SQL. Either design a new spec or use something that's more openly specified (maybe GraphQL or EdgeQL).

bafe 3 years ago |

All great features, but unfortunately most SQL DBs still miss the implementation of features from SQL:2016 like MATCH_RECOGNIZE. I wonder what's the purpose of an ever growing standard when most implementations only support a small subset of it, and often with nonstandard syntax and semantics

hashhar 3 years ago | |

https://trino.io/docs/current/sql/match-recognize.html

bafe 3 years ago | | |

Great thanks, if I understand it right, instead of having a new database engine, Trino compiles the statement into the query languages of the different backends and runs these queries in a distributed way?

minroot 3 years ago |

What's the point of a standard if it takes money to read?

blacklion 2 years ago | |

What do you think about all MPEG series and WiFi? You need to pay to read, you need to pay to implement, even if it is clean-room implementation.

Ridiculous.

gigatexal 3 years ago |

It really is utter bullshit that we have to buy these standards. What are the business models of these standards bodies anyway?

tofflos 3 years ago |

Seems you can play with SQL/PGQ at https://blogs.oracle.com/database/post/get-started-with-prop....

nologic01 3 years ago | |

Is Oracle's PGQL (e.g. 2.0) more or less the same as SQL/PGQ?

It might be interesting to have a comparison of where major databases stand (or plan to be) with respect to SQL/PGQ

la_fayette 3 years ago |

PGQ's MATCH syntax seems interesting and reminds me of writing SPARQL. I wonder if any RDBMS will support this?

gatvol 3 years ago |

Not a standard if access requires payment.

blacklion 2 years ago | |

What do you think about all MPEG series and WiFi? You need to pay to read, you need to pay to implement, even if it is clean-room implementation.

Ridiculous.

galaxyLogic 3 years ago |

The irony of for-profit (= for pay) standard is this: If someone provides a product for a price, then others should be able to produce a similar but better product for similar or better price. In the case of SQL that would mean there should be alternative standards provided by different vendors. But then if there are multiple standards, it can hardly be called a "standard".

I think tax-payer money should pay for standards, because they benefit us all. It is like the highway system, or clean air, and water.

xucheng 3 years ago |

A related question: what is the state, in terms of supporting the SQL standard, among the popular RDBMSes? It seems that almost all database engines use their own custom syntax.

jsmith45 3 years ago | |

I can say that none of Oracle, Sybase, or Microsoft SQL Server really aims at conforming to the standard. While they will often try to use standard syntax for new features if such syntax exists, there is tons of old non-conforming syntax that there seems to be no real effort to address, even by adding new options, etc. Some of these mean really common features deviate significantly from what the standard requires.

PostgreSQL does mostly aim at conforming to the standard. They will invent new syntax when needed, but compared to those previously mentioned, Postgres seems to prefer to stick closer to the standard whenever possible, including adding standard syntax for existing features.

PostgreSQL does have some places where there is deliberate non-conformance (beyond just incompletely implemented features). They document many of these deviations, and whether they think each can be fixed in the future: https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL_Standard . Looking at the list, I'd say the only ones especially likely to bite a developer are the default escape character for LIKE clauses and the non-standard trailing-space behavior for character(n) datatypes (but who uses fixed-length character datatypes instead of varchar or text?). And obviously not-yet-implemented features could bite people, but many such features are optional to implement anyway, so...

I cannot speak about MySQL or MariaDB, due to insufficient familiarity.

MarkusWinand 3 years ago | |

This is one of the questions I try to answer at https://modern-sql.com/

bafe 3 years ago | | |

Your website is great and I regularly check it to see what's new in various implementations. Unfortunately it seems that many databases don't support many modern SQL features yet. Any ideas as to why?

qalmakka 2 years ago |

As much as I try understanding it, I don't see the point of this standard honestly. It sounds like some weird fanfiction made by delusional people who think SQL is actually a single language and not a hodgepodge of incompatible dialects.

It makes no sense to have a standard SQL when nonsensical implementations like MSSQL or MySQL exist.

sdflhasjd 3 years ago |

Ah, a sequel for SQL

Alifatisk 3 years ago | |

Good one