https://github.com/Mobile-Artificial-Intelligence/maid
The UI looks nice and includes a native compilation of llama.cpp.
My main phone's screen broke, so I'm on an old Pixel 4 until it's repaired, but I've had no luck getting 2-3GB models to run so far.
When I ask it a question it doesn't respond, and when I try to switch models it starts looking for nearby devices.
It also appears to be built around role play with preset characters, so it's not really a general-purpose GenAI chatbot, as far as I can tell. https://ibb.co/DRZhZcH
https://privatellm.app/blog/phi-3-now-available-on-iphone-an...
E.g. searching for "phi-3 gguf" eventually takes you here [1], and you can download the Q4 quantized model on that page.
[1] https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf...
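If you'd rather script the download than click through the page, here is a rough sketch of how a direct link could be built. It assumes the usual Hugging Face `resolve` URL layout; the helper name `gguf_download_url` and the file name in the example are placeholders of mine, not something taken from the repo, so check the repo's file list for the actual Q4 file name.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"


def gguf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for one file in a Hugging Face model repo.

    Hypothetical helper: it only formats a string and does not check
    that the file actually exists.
    """
    return f"{HF_BASE}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"


if __name__ == "__main__":
    # "model-q4.gguf" is a placeholder file name (an assumption); replace it
    # with the real Q4 file listed on the repo page.
    print(gguf_download_url("microsoft/Phi-3-mini-4k-instruct-gguf", "model-q4.gguf"))
```

The resulting URL can be passed to a browser or any download tool, and the .gguf file can then be loaded by a llama.cpp-based app.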