Efficient GraphQL Queries in Ruby on Rails and Postgres

Efficient GraphQL Queries in Ruby on Rails and Postgres(pganalyze.com)

176 points by moritzplassnig 6 years ago | 64 comments

> The following query produces the data that we want... we just need to figure out how to write a batch loader that generates the same result.

This is where the abstraction breaks down. Should you care what the query ultimately looks like? For anything beyond a toy, you sure should. But how far should you go to ensure the abstraction produces the implementation you want? Why not just skip to the implementation you want?

Sooner or later, you need to break the abstraction. I'd rather just write the SQL for all my endpoints.

I'm not bullish on GraphQL for the same reason I'm not bullish on ORMs. For some things, an abstraction suffices, but it's far from a universal solution. If I have to break out of the GraphQL abstraction, why bother in the first place? Is there value in a mixed GQL/REST API?

> GraphQL can be as efficient as REST, but requires approaching optimizations from a different angle. Instead of upfront optimizations, we lazy-load data only when required, loading it in batches to avoid excess trips to the database.

Admittedly, the author does a good job of preserving the abstraction, but I'm hesitant to expect most devs to dive this deep, and thus hesitant to expect this level of quality/performance from a GraphQL backend.

aantix 6 years ago | |

>For some things, an abstraction suffices, but it's far from a universal solution

Fair, but the ORM covers 80% of what you need. Mapping to objects, autoloading associations. These are simple selects and joins and make up a large portion of most information based systems.

When something needs performance and the ORM isn't optimizing appropriately, sure, drop to raw SQL and optimize away. Even then, what did you gain? Shaving 100 seconds off a 200 second query, sure that's worth it. But if you optimize and manage to shave off a mere five seconds, was it worth it? Maybe. Maybe not.

For most systems where you're displaying information, you just need good enough, an ORM that feels productive and doesn't get in the way (e.g. ActiveRecord).

tyre 6 years ago | | |

I agree with everything except the five seconds note. Web pages loading in fives seconds is an awful experience.

Optimizing expensive queries not only improves the user experience (be it web or api or whatever) but also can open up entire usecases that otherwise wouldn’t be practical. Five second queries are a primate candidate for DOS attacks, whether intentionally malicious or by users spamming refresh, retry, etc.

m12k 6 years ago | |

I've seen many mobile developers be frustrated at constantly needing to ask the backend developers to make new rest endpoints for them because the existing ones were not exposing the data in the way they needed it exposed. And similarly I've seen backend devs be frustrated at continually having to make new endpoints for those nagging mobile devs. If you have teams working independently of one another, where the backend team is providing a service to the frontend team(s), it makes sense to use a tool like GraphQL to able to go "here's access to anything you need - go ahead and use it however you need" instead of needing to keep coming back and building new endpoints again and again.

If you're the one both creating and consuming the endpoints, then sure, it doesn't matter where you write your query, and building abstractions to move the query all the way out to the client is a waste of time.

lhorie 6 years ago | | |

> here's access to anything you need - go ahead and use it however you need

But that's almost never what you want: as the backend developer, it's one's job to care about security and the performance of queries. As far as I know there's no straightforward way of automagically translating every possible permutation of SQL query primitives into GraphQL primitives in a way that is intuitive to a frontend developer, reasonably performant and secure, so at some point the backend dev has to step in and do the work of connecting the GQL layer to the SQL layer.

earthboundkid 6 years ago | | |

It seems that the problem that GraphQL solves is Conway’s Law. For an org that can keep the backend and frontend together, it’s not much of an improvement.

aidos 6 years ago | |

There’s a different way of managing graphs queries. When you remember that you’re effectively running a query per layer of the graphql tree (much like an ORM does) then you can optimise all the stuff that sits in a single layer by creating a view to aggregate as required.

You get to keep the data aggregation complexity in the data layer and then have a loader that sits above it which is effectively just choosing which subsection of the tree to load.

I’d recommend looking at Hasura which does a great job of this.

atombender 6 years ago | |

The problem with "just SQL" is that you arguably want to compose queries.

A foundational observation that GraphQL makes is that every query is different depending on the client's needs. You can do "SELECT * FROM users JOIN companies ON companies.id = users.employer_id" and call it a day, but now you're already probably overfetching, and the next evolutions of your codebase ends up accruing extra logic to add or remove features: Some queries want to fetch just a subset of fields from each of those relations. Some queries don't want to fetch any companies. Some want to paginate, filter, order in different ways.

You can do all that in REST, of course. You can use an SQL builder to build the final SQL query programmatically. You can add WHERE clauses and FULL OUTER JOIN clauses to do your N+1 queries on foreign associations, with some kind of query parameter to define what associations the client wants. Field selection is also doable with query parameter and so on. You can also centralize a lot of the boilerplate to do this (e.g. FULL OUTER JOIN clauses on associations, if you do them, need to be untangled into object graphs after you get the flat results back).

And so on.

What you end up with is pretty much GraphQL. You need code — identical for every relation, modulo client-specifiable constraints and parameters — for filtering, pagination, selection, joining, etc., all of those things. You need some notation for letting the client specify what they want to fetch, both fields and associations. Then, even when you have ended up with something approximates GraphQL, you have to write a custom client for it that only works for this exact API. And then you have to write documentation. And tests, tests which don't just exercise your "business logic", but also exercises all the REST-based query API stuff.

The benefits to GraphQL are that it's standard, so you can piggyback off existing libraries both on the client side (things like Apollo) and on the backend, libraries that are already tested and documented. GraphQL itself defines a schema, which serves as documentation and a way to machine-generate clients. As a case in point, try out Prima's GraphQL playground app [1]. It's basically a web-based IDE for GraphQL. You can point the app at any GraphQL API, inspect the schema, write queries (you can autocompletion of all fields, queries, mutations, etc.) and see live results. It's a really productive debugging tool.

That alone is worth a lot, and that's why I'm bullish on GraphQL.

[1] https://github.com/prisma-labs/graphql-playground

ako 6 years ago | | |

Or you can use OData, which has the same benefits as graphql.

onemoresoop 6 years ago | | |

I think GraphQL will certainly find it's place after the initial hype. There are some good use cases. But in my opinion it should be only used on a need basis.

ben_jones 6 years ago | |

I agree but can't ignore this idiom

> If you start without an ORM you end up writing a custom ORM eventually and often that custom ORM is inferior to the third party one you're considering

Funny enough this engineering problem becomes an HR one:

If your recruiting and training processes can consistently hire devs actually fluent in SQL and web application architecture you can completely avoid the ORM and just write good modular and testable code that avoids the chaos of poor SQL performance etc.

If you're like most organizations and have trash recruiting and training processes for developers, you don't have the luxury of trusting your teams not to fall over with technical debt, and may prefer to use an ORM for immediate short term complexity and also long term technical debt management (since the ORM effectively becomes a framework with consistent patterns that can scale across teams).

gav 6 years ago | | |

I think this is a really great point, as often the HN answer is "hire better developers". The majority of companies outside the HN bubble are full of very average developers. ORMs work well in that sort of environment as it allows them to be productive, and any performance issues can be tackled by the less-average developers on the team.

A while back, I worked with a client who's stack was predominately Java but due to company policies paid 25-50% under the going rate for decent developers. This caused two problems: not being able to hire enough developers, and some of the hires not being great.

We solved that problem with a technical one: the stack moved to Clojure powering backend services with Node.js on the front end. The coolness and newness of those technologies at the time meant they could make hires. Ironically, now the demand for those technologies means they command high salaries and they're back at square one.

Bit of a long story, but my point is that sometime you need to solve people problems with technology and technology with people.

wolco 6 years ago | | |

Remove ORMs and write raw queries for everything? Or write our own orm/query builder?

I didn't like orms much at first but with some practice you can shape your queries. Much neater and cleaner if that matters in your case.

cuddlecake 6 years ago | |

> This is where the abstraction breaks down.

No. In REST, the client is responsible for resolving IDs into entities.

In GraphQL, that responsibility largely moves into the backend.

This is not "breaking the abstraction", this is exactly what GraphQL is good for.

yxhuvud 6 years ago | | |

It breaks down, when you write an API to solve a task, rather than when you write an API to expose resources and then let the consumer of the API define the task.

sandstrom 6 years ago |

We've looked at GraphQL but ended up going with Graphiti instead.

Only a few weeks since we implemented, but happy so far. Feels like it's the best of REST and GraphQL.

This page from Graphiti[1] explains the thinking behind it. Don't be discouraged by the loud pink colors ;)

[1] https://www.graphiti.dev/guides/why

Deimorz 6 years ago | |

Thanks - I hadn't heard of Graphiti before, but it looks really interesting. I like a lot of the aspects of its design and capabilities, I'll definitely need to read more about it.

xtagon 6 years ago | |

It's strange to me that they pointed out Domain Driven Design without any mention of CQRS (which isn't strictly linked to DDD, but is commonly associated) or how Graphiti would help with some of the pitfalls of CRUD-based APIs. In fact, earlier in the post they point out that the built in "verb" links feature only works with CRUD, and all of there examples are CRUD based.

mdaniel 6 years ago | |

I guess I shouldn't complain in a thread about GraphQL on Rails, but it's disappointing that Graphiti appears to be only for Rails, and not a general purpose mechanism/specification like GraphQL that it complains about

SketchySeaBeast 6 years ago | |

I was excited to look at a REST/GraphQL alternative, but that website doesn't fill me with a lot of excitement or enthusiasm. The styling choice is fairly off-putting, and the first line of the "why" has a misspelling.

sandstrom 6 years ago | | |

Yeah, I know, it's subjective but I also think the design is awful. It's good software though!

bdcravens 6 years ago |

I'm leaning towards letting the elements of my app that need to be in Rails be in Rails, but moving the GraphQL layer to Hasura, bypassing the ActiveRecord complexity. Has anyone else taken this approach?

aidos 6 years ago | |

Yes. Well, we’ve built a prototype recently that just uses Hasura for the backend (don’t need anything other than data access at this stage).

It’s a really great engine and I’m glad we took a punt on it. Aside from the expected behaviour, you get some nice bonus stuff like nested updates for free.

We mostly tried it out because we normally use python and we wanted subscriptions. The graphql story in python (especially async) isn’t amazing. When we need more business logic, we’ll build it in python and then you can get Hasura to do delivery to python (to ensure each event is run exactly once).

I’d recommend kicking the tyres to see if it works for you. The team are awesome too, and moving quickly. I have no affiliation but I think it’s going to do really well.

bdcravens 6 years ago | | |

How's the performance? As I recall, the way it does relationships is pretty sane as opposed to AR's defaults of N+1.

SPascareli13 6 years ago | | |

Have you tried Prisma? How it compares with Hasura?

honopu 6 years ago | |

I'm currently working on relegating the Rails app to the API, 95% of queries will be handled by elasticsearch, if it makes sense to make a GraphQL client, I will. Most of our searches are pre-defined, we're using elasticsearch mainly for boosting and relevance score, autocomplete, and having a unified search across a bunch of different installations such as a rails app, wordpress and a few others.

therusskiy 6 years ago |

to solve N+1 in Ruby Graphql automatically I wrote a thingy that analyzes what fields were requested in a recursive manner and then uses a preloader at the root level to load everything in advance. Solve 80% of N+1 problems automatically.

rytill 6 years ago | |

Don't know why this isn't builtin to GQL libraries - seems like the whole point of GQL is to do these types of queries and it's horribly inefficient by default.

cpursley 6 years ago | |

Sound useful. Is this available as an open source lib?

rectangletangle 6 years ago | |

I ended up doing something very similar for Django's ORM years ago, before GraphQL really came into popularity.

johnymontana 6 years ago |

Another option for solving the N+1 query problem is to generate a single database query at the root level by inspecting the selection set of the GraphQL query. This is the approach used by Neo4j GraphQL (and other GraphQL database integrations): https://grandstack.io/docs/neo4j-graphql-js.html

yxhuvud 6 years ago | |

That might be doable in Ruby using the Sequel library, but if you are stuck in a Rails codebase using Activerecord, the amount of query complexity until that approach break down is not very big.

e12e 6 years ago |

Somewhat on the side, but:

> By using the includes method we've been able to knock our queries down from six to two: The first to find the events, and the second to find the categories for those events.

Am I the only one that thinks rails/activerecord is crazy here for (often) preferring two trips to the db over a join on indexed primary keys?

Are there really (sane) use-cases where that is better?

chx 6 years ago |

After https://stackoverflow.com/a/57247614/308851 I am at a complete loss of why I would I want GraphQL. A client uses it , so I obey but I do not see the point. To quote the linked answer to save you a click:

> GraphQL does not provide any built-in mechanisms for filtering, sorting, pagination or other arbitrary transformations of the response

lozenge 6 years ago | |

You wouldn't want to let people filter by an unindexed field, sort huge large result sets, or make your server store a huge result set because they stop making requests after getting the first few pages. GraphQL lets you express what is permitted, without allowing more than your underlying data store can support.

For example, from the GitHub GraphQL API you can in a single request get all open pull requests on a repository and the last 10 comments on each one. You can't get the last 10 comments on all of GitHub or the last 10 comments made by a particular user, because they haven't exposed or even stored the data that way.

In a REST API that would be dozens of separate requests, or the endpoint would just not exist because they have never had the need for that exact combination of records to be queryable.

krschultz 6 years ago | |

GraphQL the language does not specify those features in a rigid way, many GraphQL implementations have those features. It's best to look at the spec as the minimum to expect not the maximum you can get from GraphQL.