AFAIK the featurizers built into spark.ml pipelines are 'just' for text (plus generic numeric and categorical features), whereas 'KeystoneML also presents a richer set of operators than those present in spark.ml including featurizers for images, text, and speech, and provides several example pipelines that reproduce state-of-the-art academic results on public data sets.'