Ask HN: Using Kafka and Storm/Spark for pub/sub with a high number of subscribers

I'm building a realtime app that needs to publish messages to connected mobile devices. Traditionally, I would have the mobile device poll or refresh on a schedule to look for updates, but I would like to push updates in realtime because delivery of the message to the device is time sensitive.

The architecture I'm thinking of implementing gives every subscriber (mobile device) its own topic in Kafka, since each message is user specific. So if I have 100k mobile devices connected, I would have 100k topics with 1 consumer each. Is this efficient, and can Kafka handle it?

After doing some research, I figured out that I can use Apache Spark or Storm to do the data computation before sending the message out to the consumer. My app design resembles a stock app: I get the raw data from an external API and then process it on the server side before distributing it to the connected clients.

My main concern is processing. If I have 100k clients connected, I would have to process the raw data once for each connected client so that I can send a computed message per client. This could be resource intensive.

Are there any other design paradigms I can use to solve this issue? Has anyone implemented something similar? Am I over-engineering the solution?
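
To make the processing concern concrete, here is a rough sketch of the per-client fan-out described above. It is not real Kafka, Spark, or Storm code: the broker is an in-memory stand-in, and every name in it (`Client`, `InMemoryBroker`, `compute_message`, `fan_out`, the `user-<id>` topic naming, the watchlist idea) is made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Client:
    user_id: str
    watchlist: tuple  # symbols this user cares about (hypothetical field)


class InMemoryBroker:
    """Stand-in for Kafka: maps a topic name to a list of messages."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        self.topics[topic].append(message)


def compute_message(client, raw_quotes):
    """Per-client computation: keep only quotes on this client's watchlist."""
    return {s: raw_quotes[s] for s in client.watchlist if s in raw_quotes}


def fan_out(broker, clients, raw_quotes):
    """One pass over every connected client for a single raw update."""
    for client in clients:
        message = compute_message(client, raw_quotes)
        if message:
            # One topic per user, as in the design described in the post.
            broker.publish(f"user-{client.user_id}", message)


clients = [Client("alice", ("AAPL", "MSFT")), Client("bob", ("MSFT",))]
broker = InMemoryBroker()
fan_out(broker, clients, {"AAPL": 101.5, "MSFT": 310.0, "GOOG": 99.0})
```

The loop in `fan_out` is the part I'm worried about: with 100k connected clients, every raw update from the external API triggers 100k computations and 100k publishes.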