Nearly all web APIs get paging wrong

Nearly all web APIs get paging wrong (vermorel.com)

9 points by apievangelist 11 years ago | 11 comments

junto 11 years ago

I'd like to see this continuation pattern described in a bit more detail. How does the client define the sort parameters or preferred data limits per continuation? Have I missed the point?

quesera 11 years ago

Continuations are "easy" (if you have decent language support or can glom it on) but that's keeping state on the server side. The author describes continuation tokens that "never expire", which is incompatible with server-side state.

Without state, a token can be a bookmark into a predefined ordered dataset. That's more reliable than an offset, but just as inflexible, and much more expensive on the server side.

Or, I'm missing something too.

reipahb 11 years ago

One thing to note is that while the continuation token is just a blob of data from the client's perspective, the server can actually use it to store the required state information.

A simple method would be to take the state information the server needs in order to continue the enumeration (e.g. sorting order, how far along it was in the enumeration, etc.), JSON-encode it, encrypt and sign it, and then base64-encode it.

Return that token to the client, and if the client wants more data it can pass that token back to the server, which can decode it into all the information it needs to resume the enumeration.
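For what it's worth, here is a minimal sketch of that round trip in Python. It only signs the state (HMAC-SHA256) and leaves out the encryption step, since the standard library has no authenticated encryption; a real service that wants to hide the state from clients would add that with a crypto library. `SECRET_KEY`, `encode_token`, `decode_token` and the field names in the state dict are all made up for illustration.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; in practice this would come from config.
SECRET_KEY = b"example-secret"


def encode_token(state: dict) -> str:
    """Serialize the enumeration state, sign it, and base64-encode the result."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # The 32-byte signature is prepended so the server can split it off later.
    return base64.urlsafe_b64encode(signature + payload).decode("ascii")


def decode_token(token: str) -> dict:
    """Verify the signature and return the state; raise ValueError if tampered."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    signature, payload = raw[:32], raw[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid continuation token")
    return json.loads(payload)


# Example state: sort order plus the last item already returned (assumed fields).
token = encode_token({"sort": "date", "after_id": 120, "limit": 30})
state = decode_token(token)
```

Since everything the server needs travels inside the token, nothing has to be kept server-side between requests, and the token has no built-in expiry unless you put one in the state.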

karmakaze 11 years ago

I've often wondered how the paging on HN could be better. The main issue is going from page 1 to page 2 where items move between them and I either see items a second time, or miss them. The problem with fixing a sequence on first page load is then when to refresh for new content--only on page 1? Lastly, a prescription is not helpful without a design for efficient implementation. How can this be achieved in a stateless manner?

reipahb 11 years ago

The only way I see to solve this without server side state is to replace the page-parameter with a list of items you have seen. The more button would then just find the top 30 items you haven't previously seen. Unfortunately, this would become unwieldy very quickly, and sooner or later you hit the browser limitations on maximum URL size.

A relatively simple approach that involves server side state is to periodically (once a minute?) generate the list of (for example) the 10000 top items.

(A high traffic site will most likely want to do this in any case, so that it has a cached list of items ready to serve to clients, instead of issuing a database query to find the top items for every request.)

Now, instead of overwriting the list of top items every time you regenerate it, keep multiple versions of the list. Then you can make the link to the next page specify the version of the list and the page number. That way, users will browse through one specific version of the list.

(This requires storing some state on the server, but the amount is relatively small. You control the size of the generated list, how often new lists are generated, and how long they are kept, so there is an easily calculated upper bound on the amount of state information you need to store.)
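A rough sketch of the versioned-list idea, assuming an in-memory store and a fixed number of retained versions. The class name `SnapshotStore` and its methods are invented for this example; a real site would regenerate the list on a timer and probably keep it in a shared cache.

```python
from collections import OrderedDict


class SnapshotStore:
    """Keeps the last few versions of a ranked item list (hypothetical helper)."""

    def __init__(self, max_versions=3, page_size=30):
        self.max_versions = max_versions
        self.page_size = page_size
        self._snapshots = OrderedDict()  # version number -> ranked list of item ids
        self._next_version = 1

    def publish(self, ranked_ids):
        """Store a newly generated ranking and drop the oldest if over the limit."""
        version = self._next_version
        self._next_version += 1
        self._snapshots[version] = list(ranked_ids)
        while len(self._snapshots) > self.max_versions:
            self._snapshots.popitem(last=False)  # evict the oldest version
        return version

    def page(self, version, page_number):
        """Return one page (1-based) of a given version, or None if it expired."""
        snapshot = self._snapshots.get(version)
        if snapshot is None:
            return None
        start = (page_number - 1) * self.page_size
        return snapshot[start:start + self.page_size]


store = SnapshotStore(max_versions=2, page_size=2)
v1 = store.publish(["a", "b", "c", "d"])
v2 = store.publish(["c", "a", "b", "d"])  # ranking changed in the meantime
first = store.page(v1, 1)   # a reader who started on v1 ...
second = store.page(v1, 2)  # ... keeps paging through v1
```

The "more" link would carry both the version and the page number. When `page` returns None the version has expired, and the simplest fallback is to send the reader to page 1 of the newest version. The upper bound on state is `max_versions` times the list length.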

karmakaze 11 years ago

This solution satisfies my usage and doesn't use a continuation token, though one could be constructed from the version and page. It does, however, expire.

I can see that the approaches in the other comments on constructing continuation tokens won't work for HN, assuming post upvotes are updated in place.