The reality is Apple deeply cares about releasing a polished product. Releasing an LLM-based Siri that makes really bad gaffes would be a PR nightmare. Google already suffered that when it opened up Bard despite pushback from folks working on it.
The fundamentals of LLM architectures, training, etc. are no longer "secret sauce"; tons of major tech companies are working on in-house LLMs at this point. I don't see Apple as not having a future for Siri. It's a rather silly conclusion that doesn't have much behind it at all.
This is absolutely untrue in the Tim Cook era. Apple releases buggy hardware and software all the time.
This article talks about how they released a bad product and know it. Siri is significantly worse than both Google Assistant and Alexa, and even those are comically bad sometimes.
Bugs in iOS and Siri give me rage aneurysms every single day (coming from a long-time Android user who just wanted a small phone).
I've had great experiences with Apple hardware, personally, but the software frequently makes me shake my head.
Didn't Steve Jobs sit behind antennagate and bendgate?
Today, I am fortunate enough to be developing 'characters' and backstories for AI companies, and I think it's working; the new prototypes I've been working on really feel warmer (or more logical, or more relatable, or more philosophical, depending on what we're trying to transmit).
I think a lot of AI companies are really missing this important point, and I think Apple, of all companies, should have got this!
I have been using it more and more to send email and messages.
Siri’s most common response to me: “You have to unlock your iPhone to do that”. Not helpful.
This stuff needs to be optional and not forced down our throats, and it needs to be more user-controlled, work when we want it and not when we don't want it.
Risk aversion from immature tech is a part of Apple’s DNA. It should be no surprise that Apple declined to push Siri beyond its limited capabilities.
LLMs are quite new and Apple has plenty of cash to hire star devs to catch things up. Writing off Apple’s AI future is premature.
Boy, that Simpsons episode where they make fun of the Newton really burned them bad.
WebGL has been screwed since 16.4
Siri: let me see what I found on the web about f(misspelling, $any_question?$)
They do a much better job of unifying device experience through software.
> By 2018, the team working on Siri had apparently "devolved into a mess, driven by petty turf battles between senior leaders and heated arguments over the direction of the assistant." Siri's leadership did not want to invest in building tools to analyse Siri's usage and engineers lacked the ability to obtain basic details such as how many people were using the virtual assistant and how often they were doing so. The data that was obtained about Siri coming from the data science and engineering team was simply not being used, with some former employees calling it "a waste of time and money."
> Many Apple employees purportedly left the company because it was too slow to make decisions or too conservative in its approach to new AI technologies, including the large-language models that underpin chatbots like ChatGPT. Apple CEO Tim Cook personally attempted to persuade engineers who helped Apple modernize its search technology to stay at the company, before they left to work on large-language models at Google.
> Apple executives are said to have dismissed proposals to give Siri the ability to conduct extended back-and-forth conversations, claiming that the feature would be difficult to control and gimmicky.
> Cook and other senior executives requested changes to Siri to prevent embarrassing responses and the company prefers Siri's responses to be pre-written by a team of around 20 writers, rather than AI-generated. There were also specific decisions to exclude information such as iPhone prices from Siri to push users directly to Apple's website instead.
> Siri engineers working on the feature that uses material from the web to answer questions clashed with the design team over how accurate the responses had to be in 2019. The design team demanded a near-perfect accuracy rate before the feature could be released.
> Engineers claim to have spent months persuading Siri designers that not every one of its answers needed human verification, a limitation that made it impossible to scale up Siri to answer the huge number of questions asked by users. Similarly, Apple's design team repeatedly rejected the feature that enabled users to report a concern or issue with the content of a Siri answer, preventing machine-learning engineers from understanding mistakes, because it wanted Siri to appear "all-knowing."
> In 2019, the Siri team explored a project to rewrite the virtual assistant from scratch, codenamed "Blackbird." The effort sought to create a lightweight version of Siri that would delegate the creation of functions to app developers and would run on iPhones instead of the cloud to improve performance and privacy. Demos of Blackbird apparently prompted excitement among Apple employees owing to its utility and responsiveness. Blackbird competed with the work of two senior leaders on the Siri team who were responsible for helping Siri understand and respond to queries. These individuals pushed for their own project, codenamed "Siri X," for the 10th anniversary of the virtual assistant. The project simply aimed to move Siri's processing on-device for privacy reasons, without the lightweight, modular functionality of Blackbird. Hundreds of employees working on Blackbird were assigned to Siri X, which killed the ambitious project to make Siri more capable.
This seems completely dysfunctional.
[0] https://www.macrumors.com/2023/04/27/report-details-turmoil-...
Me: 16 minutes
Siri: your timer is set for 16 minutes
Me: 17 minutes
Siri: 17 minutes and counting
Me: 18 minutes
Siri: I don’t understand that
Some article like this comes up every so often and it's so non-newsworthy. Anyone who's been at a project planning meeting knows there's a small contingent of engineers in the corner muttering that it's a terrible idea and it will never work.
I know a bunch of capable programmers that would love the chance to go into one of these specialties but don't have the resources nor opportunities (time/money/location) to go back to school.
Siri could be so much better.
Why can't an eCommerce app implement "Hey Siri, tell Online Store to add Tidepods to my weekly order"?
What about Audible implementing "Hey Siri, tell Audible to add Tale of Two Cities to my reading list"?
There are remnants of a car service interface that never got used or shipped AFAIK. If you ask "Hey Siri, get me an Uber to the train station", it just replies "sorry, I help with rides".
Or maybe even work with other Apple stuff better. I can say "Hey Siri, turn on Television" and that works and then say "Hey Siri, mute television" and it will reply, "There are no televisions to control".
https://developer.apple.com/design/human-interface-guideline...
He (she's set to an Australian male voice) is great if you know exactly what to say to get exactly what you want. He's horrible with general requests that you haven't made before.
Particularly with smart home devices.
If you say "Buy a two by four" the transcription goes from
Buy
Buy a
Buy a two
Buy a two by
Buy a 2 x 4
"Four inches by three inches"
For
Four inches
Four inches by
4" by 3
4" by 3"
And that's all on device.
The issue is that when going to GPT, that's going to cost someone a few pennies each time a request is made. When that's scaled up to the installed base of all Macs and iDevices, that gets expensive fast.
It also needs some persistence. It needs to know that I never want to play music on my HomeKit. I tried deleting all the music from iTunes but that stupid free U2 album keeps coming back and it often thinks I want to play music instead of doing automation actions.
And it should be able to read my texts and tell me if something important comes in. Stuff like that. AI models should be able to deliver those things. All this manually scripted stuff is a dead end.
Then I use Siri a lot with my HomePods for music. It works rather well, but when it fails it hurts. Sometimes a new song is playing that I like. So I say "Hey Siri add this song to my inbox playlist." Siri then occasionally tells me "OK, I'm playing some-other-song-you-didn't-ask-for." There is then no way to return back to the previous song or find out what it was.
I hate Siri when these things happen.
Sometimes she'll also misinterpret commands for unclear reasons. "Tomorrow at 7am, remind me to call John" and she responds "Ok, I've turned on your 7am alarm". I try again, speaking more clearly, and she says "Your 7am alarm is already on"
Unlike the Newton they don’t seem to have fixed it … or better yet, replaced it.
I’m surprised Giannandrea hasn’t junked it by now.
- private, on-device language model execution (see llama.cpp for feasibility)
- a single, consistent AI available with you wherever you go
- total access to your personal information / documents (knows your birthday, can see your meeting notes)
Because they have the hardware to run it locally, they have three very hard-to-beat advantages:
1) privacy, because the LMs can see all your stuff but none of it goes back to Apple. Microsoft can't do this; they get flak every time they try and phone home with telemetry, and they don't control their platform enough to run a massive LM in the background.
2) omnipresence: if you're in the Apple ecosystem, you'll always have your iPhone with you. That means the LM will have access to location data, maps, chat - everything. And since it never leaves the phone, privacy-oriented people may be ok with it. And that means the LM can be exponentially more useful than just summarizing documents.
3) evaluation costs - they are the only competitor who will not have to pay for a massive datacenter, which means that the LMs can be as powerful as the M2+ hardware they sell. Everyone else will have no alternative to running the LMs on their centralized, expensive hardware.
Siri is fine at controlling smart home devices, giving weather, timers, etc... but if you ask it questions, more times than not it wants you to use your phone. Google Home handled that so much better.
We have Apple Music and have Homepods around the house so we use it for that.
"hey siri what timers are running" -> "it's 10:45am"
"hey siri how long left on the timer" -> "there's one timer with 5 minutes remaining"
<bedroom homepod>
"hey siri list all timers" -> "there are no timers set"
"hey siri how long left on the timer" -> "it's 10:45am"
Granted, I have very few things I use it for, but I like it. The kids like them in their rooms too. I’m happy with Siri, but maybe I don’t ask it to do much.
I think the secret is low expectations.
So they're clearly still in prime position to make their product better, even if they're behind right now. That's unless they get an antitrust lawsuit for unfair monopoly, which they probably should get considering this seems to me similar to what Windows was doing forcing IE as the default browser back in the days. Just like people should be able to choose their default browser, they should be able to choose their default voice assistant.
Recently, when I ask it to "Send a text to <name> that says <content of text>" it says something like "I notice you often send texts to <name> using Apple Messages, so I will use that for this text. Is that Ok?" before reading me back the text and then sending it. I'm sure that there are people out there who have a rich and complicated mapping of text-communication-app to recipients, but I literally have only one text communication app on my phone, the one that it came with, and I only ever use that. It's already annoyingly slow to interact with Siri on a multi-step process, and adding another step to it is awful.
I can't wait for them to start including LLM-AI voice chatting where one can properly tell the phone what I want it to do.
I'm having a good time playing with LLM's, but I certainly don't trust any output at face value. I know I would value a correct answer from Siri, or any other service.
Oh yeah, the keyboard replacing fuck with duck even though I never wrote duck until now is enough reason to pop over to Android.
For example, you might search for a street in Maps and search will just give up and give you a half-arsed result. Searching for HN Boulevard might instead return HN Avenue even though you can see the correct result right there on the map.
You see similar behaviour in Music, App Store etc.
It's more useful than dictating when I need something quickly and discreetly.
When chatbots arrive on iPhone, this is how I'll be talking to it.
I don't trust any voice assistant because I know every query is being stored and analyzed, and I don't own my own communications with the assistant. I also don't get to customize the assistant.
It's basically a shitty-future branding assistant who is inclined to send you to product pages and shit to buy.
One of the more impressive things about using Google's voice assistant was that it did very well in noisy environments, whereas with Siri that is quite a struggle. This is only about speech-to-text, not text-to-whatever.
Today: “Hey Siri, watch a movie in the dark” “I can’t find a movie called In the Dark.”
I also had a very frustrating issue randomly start happening because one of my lights had the word “lamp” or “light” in it. Thankfully googling the symptoms found others with the same problem and a solution, but it was baffling as there hadn’t been any changes made by me in over a year prior.
It usually goes like this:
Me: Hey Siri, “bedroom on” in 15 minutes.
Siri: who is speaking?
Me: $Name
Siri: I don’t recognise your voice… setup personal requests bla bla.
Me: Hey Siri, who am I?
Siri: You’re $Name!
Me: Hey Siri “Bedroom on” in 15 minutes.
Siri: Who’s Speaking?
Me: @%#+#!!!
Siri then goes into a loop asking who I am and then ends with “Hi!”
Five minutes later,
Me: Hey Siri, “Bedroom on” in 10 minutes.
Siri: okay $Name I’ve set bedroom on for 12:08.
So it’s not the voice recognition, it’s whatever’s going on after that step.
When I try to use it I get consistently laughably bad results. `Play songs by Albert King on Spotify` yields something like "Playing songs by R Kelly on Spotify". No, thanks.
Even when I ask for songs, albums, or artists I have saved in my library, Siri invariably finds something else entirely unrelated to play.
What's really frustrating is that this used to work reasonably well for me.
timerId = setTimer(float time);
function updateTimer(timerId) {};
maybe wrap that in a class, and use that when using Siri. you just have to speak the class definition first.
Hey Siri! set a timer using myCustomClassTimer!
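A rough sketch of the "wrap that in a class" idea above, written in Python rather than the comment's pseudocode. The class name `TimerRegistry`, its method names, and the injectable `clock` parameter are all made up for illustration; nothing here reflects how Siri actually handles timers.

```python
import itertools
import time


class TimerRegistry:
    """Tracks several labelled countdown timers by id (hypothetical design)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock             # injectable so the class is easy to test
        self._ids = itertools.count(1)  # hands out 1, 2, 3, ...
        self._timers = {}               # timer_id -> (label, deadline)

    def set_timer(self, seconds, label=""):
        """Start a new timer and return its id; existing timers are untouched."""
        timer_id = next(self._ids)
        self._timers[timer_id] = (label, self._clock() + seconds)
        return timer_id

    def update_timer(self, timer_id, seconds):
        """Restart an existing timer with a new duration, keeping its label."""
        label, _ = self._timers[timer_id]
        self._timers[timer_id] = (label, self._clock() + seconds)

    def remaining(self, timer_id):
        """Seconds left on a timer, never negative."""
        _, deadline = self._timers[timer_id]
        return max(0.0, deadline - self._clock())

    def list_timers(self):
        """Return (id, label, seconds_remaining) for every known timer."""
        return [(tid, label, self.remaining(tid))
                for tid, (label, _) in self._timers.items()]
```

With something like this, "16 minutes" followed by "17 minutes" could either call `update_timer` on the running timer or `set_timer` for a second one; which of the two the user means is the part the code cannot decide for you.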
I use my Google Home similarly. But understand that these use-cases are utterly trivial, and Siri (by most accounts) and Google (from experience) still manage to get them wrong too often. It's unbelievable.
I have Google Home products and a YouTube Premium subscription. About 1 in 10 times when I ask for music it'll default to Spotify, which I have never used. How? Why? One of my speakers now reacts to every request with an error when it's not the primary speaker being spoken to.
It's a novelty, nothing more. I wouldn't rely on this stuff for anything. In fact, it's a bug or two away from being permanently removed.
I have a Mini that consistently misunderstands broadcast requests and says "sorry I'm not playing anything right now". When it occasionally speech-to-text converts a broadcast word, it consistently cuts off the first word or letter, even when it's "I'll be right down" users will get "L L be right down".
It used to support simple offline requests like SMS and Navigate when data was unavailable. No more.
It used to integrate with Google Keep. No more.
No longer recognizes the word "torch" as a synonym for flashlight. Why would I be asking to turn on my phone's "porch"?
Painfully slow replies, even under ideal network environment... Just spinning forever, often until a timeout that it doesn't even have the decency to respond with a proper error message.
It's just amazing how they launched a product with a clear "this is where we are, this is our vision of where we're going" and they still sell it but instead they're going in the opposite direction.
I must disagree. ChatGPT-style LLM functionality with ElevenLabs-quality realtime voice synthesis will absolutely supercharge these products. The ability to e.g. answer kids' questions in simplified English according to parental prompt guidelines, or drill down on complex educational topics, or maintain context over many back-and-forth conversational interactions will be huge.
This is mind-blowing to me because I'd guess that ~50% of my interactions with Siri are buggy. I mostly use it through CarPlay.
It even fails to call the correct contact in my phone, which should be the easiest thing it does.
Yeah, literally all I use it for is to set a timer when I've put warm beer in the freezer.
I’m also surprised you’re surprised?
If it can do something — well, that’s the usecase. It just works.
If it can’t, then why would anyone even be trying to do that? That’s not how it’s done.
That’s the Apple way. And that’s exactly how Siri is perfect!
B) It's not always a recognition thing. Siri often knows what I said and has a completely baffling response to it.
Google Maps: I swear, about half the time I try to activate voice search, it sits and spins before even accepting any voice input at all. Why can’t it just start reading the microphone right when I activate it, and then submit the saved audio whenever it’s done getting set up? It’s so abysmally poor that it’s usually faster to scroll through recent destinations or literally grab the phone, unlock, and put in a destination.
This is just a market begging to be disrupted. I want to see a startup combine Whisper, GPT, and a competent TTS model into a killer voice UI!
What do people use it for? Mostly just to set alarms or turn off lights etc.
With all these breakthroughs in AI and things like Whisper, which brings incredible voice recognition, these tools are ripe for an upgrade even though they have been stagnant for years.
ChatGPT-like abilities + voice assistant, and these 'voice assistants' are relevant again and actually deliver on their original promise.
Me: "Siri, give me directions to vaguely ethnic sounding café" where café is about 2km away.
Siri: "Getting directions to other cafe in London/Europe/Tajikistan".
So there's no line of code that says "If the found location is >2000 km away with no direct land route and/or crosses multiple continents, it might not be the right one".
Google Maps and voice works pretty flawlessly.
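The sanity check described above is not much code. Here is a minimal Python sketch, assuming candidates arrive as (name, latitude, longitude) tuples and using the 2000 km figure from the comment as the cutoff. The function names and the sample coordinates are illustrative only, and it covers just the distance part, not the "direct land route" part.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KM = 2000.0  # cutoff taken from the comment; an assumption, not a known Siri setting


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pick_destination(user, candidates, max_km=MAX_PLAUSIBLE_KM):
    """Return the name of the nearest candidate within max_km of the user.

    user is (lat, lon); candidates is a list of (name, lat, lon).
    Returns None when every match is implausibly far away, so the caller
    can ask for clarification instead of routing across continents.
    """
    scored = [(haversine_km(user[0], user[1], lat, lon), name)
              for name, lat, lon in candidates]
    plausible = [item for item in scored if item[0] <= max_km]
    if not plausible:
        return None
    return min(plausible)[1]


# Example: a user in Sydney with one nearby match and one in London.
user = (-33.87, 151.21)
matches = [("cafe in London", 51.51, -0.13), ("cafe down the road", -33.88, 151.22)]
print(pick_destination(user, matches))
```

Returning `None` rather than the least-bad match is a deliberate choice here: a "did you mean...?" prompt is arguably less annoying than directions to another continent.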
That's about all of Siri I've ever found useful in the past 10 years or so
The only times I ever try to use it intentionally (like asking it to take a note, or make a hands free call while I'm driving) - it screws up or requires clarification to which I have to give screen attention, defeating the purpose.
Most of the time it just pops up and annoys me when I call my wife "sweetie".
But now after some remarkable experiences using GPT-4 I find I’ve lost a lot of patience with all the different voice assistants. They are just so stupid in comparison. How much longer before LLMs and projects like Whisper run the backend?
Then again the only thing I've used it for since launch is setting timers, rarely gets that right these days.
My wife has an iPhone and it's hilariously bad at this. She needs complete silence and even then it's a coin flip.
For $1B+ invested.
Said in the tone of Cobie Smulders in How I Met Your Mother (Nobody asked you PATRICE!)
If anything, giving LLMs access to apis makes them less likely to hallucinate.
If they have control of door locks, there's a problem.
ChatGPT has gotten us there in writing, and the AI generated vocals are already really close. Unless Apple acquires OpenAI, Siri is doomed.
They surely have enough resources to train a competitive LLM themselves...
I think that all the deep learning models for handling speech on the Apple Watch are run locally on the watch.
https://www.theinformation.com/articles/apples-siri-chief-st...
You have to be aware of what makes a good Siri question or task. Technical people tend to understand this implicitly; we are, after all, talking to a computer, and computers are notoriously literal and have trouble with implied contexts, etc.
I think I've talked about this here before, but my wife often phrases questions to Siri in a way that results in the dreaded "I can send the results to your iPhone" non-answer. One example I remember happened when we were idly talking about King Charles. My wife asked Siri "how old is king charles" and got the non-answer. I asked Siri "what year was king Charles born" and got hard data back.
It's that kind of thing.
In the narrow case of music there's more to complain about, I guess, but the base problem is specificity and name collision. It doesn't seem to always pull the example of any given non-unique name that I might want; sometimes I wonder if what I get is just random.
If you ask for "Take Five", you MIGHT get Dave Brubeck. I'd argue that, in the absence of something specific, you SHOULD get Dave Brubeck, and moreover you should get the album cut from "Time Out." But Siri doesn't really agree, for whatever reason.
OTOH, if you ask Siri to play "Take Five from the Dave Brubeck album Time Out" you'll get exactly what you want.
Siri excels in simple, discrete asks or tasks, though. We both routinely use it to add things to the shared shopping list we keep in Reminders. That's kind of awesome, and beats the old norm of "go find a pen to add this to the list that you may or may not remember to take with you when you go shopping." Setting timers or alarms verbally is awesome. The list goes on.
I tried it with your prompt and Spotify started playing the whole album. Which is better than my usual outcome. Of course, it took three tries. The first two times Siri said it was going to play and nothing happened.
Usually I don't even get the same artist or album I ask for. Admittedly, I stopped trying a few months ago, so maybe they've improved things a bit lately, or maybe I just got lucky.
It was a novelty touted as a big leap in technology.
This lets me have 100% control of every feature/button and every app whilst I drive.
I can swipe on the screen using my voice.
I can write notes and messages.
I can switch to any app.
Voice control is a world leading accessibility feature that is a complete touch replacement controlled via voice.
Once I no longer need voice control I turn it off via Siri. The entire action of turning it on, using it, turning it off occurs with no touch at any point.
Also it's great for when I am cooking or doing diy and don't want to touch my phone.
In comparison Siri is better at a few narrowly defined actions but has less than 1% of the scope of voice control.
Reading people's complaints about Siri in this thread I feel what most people actually want is voice control.
But no one outside the visually impaired community knows about it.
That's how people describe it anyways.
"hey siri, set a 30 minute timer for the rice."
it works just fine for me, use it all the time.
>There's already a [number] minute timer. Replace? Confirm/Cancel
Running iOS 16.1, so unless they added this feature in the last few months I'm not sure why we'd see different results.
edit: Perhaps you're using HomePod? Apparently Siri on HomePod supports multiple timers, while Siri on iPhone does not. Goofy.
I mean, I guess a watch is supposed to be for telling time or whatever, but this is just a weird feature omission…
I know, right?
Also, a little-known hack: to set a timer you don’t have to say the whole thing. Just press the Siri button or say hey Siri ‘n minutes’ and it will set a timer for n minutes.
Latest update(?) it quit announcing expiration by name if only one was running. Just beeps, and maybe I forgot what it was supposed to be timing. Annoying.
Especially annoying if more than one person in the house is setting timers.
Input recognition quality falling all over the place - it's very much noticeable with their onscreen keyboards as well. Some "intelligent" mechanisms know better which button I tried to tap and I'm getting gibberish even with auto-correct off. It's the same on iPhone and Watch.
Apple has already forced me to re-learn keyboards once - with the MacBook Pro 2017 fiasco when I had to start carrying external keyboard with my "mobile" laptop. Haven't bought a MacBook since.
I'm primarily a Windows wizard, I use my Macbook for specific tasks and it's not my daily driver. I can type on my Macbook mostly fine, but occasionally it misses some of my key inputs. I know the keyboard works, the key in question types in properly when I press it again, but nonetheless I have the occasional input go missing as I type.
I don't know why this is the case, it annoys me, and I'm left wondering if it's the keyboard with no obvious defects or my typing which works flawlessly with literally all other keyboards I ever come across.
Did they test this at all? Why would you ever pick a more complex/verbose option from the results list?
I use Android Auto in my car for safety. God forbid I'm listening to a song and want to queue up another one right after. I don't think I've ever gotten that behavior to work without either skipping the current song or adding the second song to the end of the queue. And even that's assuming that it even recognized the proper song out of my library and didn't try to go to Youtube or something.
Meanwhile, if my buddy is riding shotgun, I can just say "Hey, put on Peace of Mind next"
“Siri, play Beethoven”
<plays a Beethoven-derived hiphop track>
!!!
20% of the time it can't play songs on Spotify, and I have no idea why.
My 'favorite' Siri 'feature' is when I ask for directions to 'home' it gives me directions to the local Home Depot.... really???
EDIT: After thinking about this for a while, the reason this irks me is that 'directions to home' is one of the most basic asks. I am not asking Siri to play some obscure unreleased track from an underground British punk band. I am asking for directions to my home, and it fails.
If I spell or type out the address, it still adds the E, with no apparent way to get to my actual home address.
As a bonus, last night I said “cancel my 6am alarm” and it said “you have 29 alarms around that time”, and proceeded to read through each of them. My wife nearly died laughing as it started to rattle them off.
I'm mindful my English pronunciation is not 100% accurate, but Alexa gives me no problem at all.
I don't do lots of home automation mostly because the automation isn't practical in a new build house that has switches in appropriate places. There are extremely rare circumstances where I want a light on/off in a room I'm not in. However, it would be a fun project to build a nice remote control with switches and knobs that you can program and have it talk to the home automation solution. To make it even more fun, I'm talking hard flip switches and volume like knobs, really tactile.
well sure, not funny for the person actually going through it, but as the premise of a skit seems to hold some promise.
For what it's worth, I think Alexa has gotten worse, but for different reasons.
It was called Ping for anybody wondering.
"iTunes Ping" or "Apple Music Connect".
Yes, they tried it twice.
I just set timers using the crown on my watch now, combined with custom google assistant commands to control what I need (lights, tv) through my Nest. Playing and controlling music is too much of an extreme sport for my taste on either assistant, and I've stopped using it for calls too as there's always a 5% chance it will call up some random person I knew decades ago but haven't removed from my contact list because reasons. The name is of course totally different from the few persons I do call on a regular basis.
It's great for joking around with my kid though, laughing at the misinterpretations and stuff like that. And that's about it.
- After four years it still doesn’t understand my youngest son’s name. I added a phonetic spelling to his Contacts card. I told Siri 200 times: “my son’s name is pronounced XYZ”. I religiously corrected his name 1000 times when speech recognition misunderstood it. Nothing. Joke.
- Siri is triggered by anything that even vaguely resembles “hey Siri”: “…they seriously…”, “…ok see here…”, “easier”. And once that dumb piece of %$£~ is triggered she HAS to finish her cutesy “listen to me being a helpful and funny assistant” sentence. Again: sad joke.
Most false positive triggers stem from Audible books as well as audio play from the Apple Audio.
Sometimes I feel Siri is desperate to be triggered, because I hardly ever use it; if at all, I use it in the car when I have to change direction and need to set a new course in a map app.
The most striking example for me is that the word "half" - as in, "Hey Siri, set the lights to half" - stopped working for six months, then started working again.
Has it really, or have your expectations increased instead? Or is it because they initially only worked well with business US English and have been trained over the years to support many more languages, dialects and even variations of said dialects/languages + slang and profanity?
Also, at the beginning people were talking to these slowly, articulating every word. Now everybody takes it for granted they should be understood and talks in their lazier and more natural way.
I've never activated speech recognition on any device I own. But last week I was in a video call with my partner's Mexican family and early in the call they were trying to get my mother-in-law's Amazon Echo to stop the music, and it only really stopped when my sister-in-law started using profanity, something like "no mames! chinga to pinche madre Alexa cayate!"
Apparently this stuff became so used/trained by people not using them in a polite way that they eventually only react when you talk badly to them using slang and profanity.
I've started turning many of the autocorrect-related features off entirely.
I use it. I don't like it. (I haven't found anything else that's much better, and that's with a fair amount of SwiftKey use; nothing beats now-dead Swype.)
Routinely - routinely - I try to swipe out "and". Often, I get "abs" (I don't work out, let alone write about it), but the most common and bizarre one is "Abbas". The only Abbas I know is the Palestinian president, and I don't write about him at all, let alone enough to justify having that be a common word that comes up.
I realize that with put/out/or it's just not easy to distinguish. But I'll live with that. Abbas?
Even basic stuff like changing the lock rotation can't be done with Siri. I just don't understand how it could be that bad. I feel like I could sit and code a better parser that would be more useful. And Siri was the first really big voice assistant, they've had 13 years to get it working.
"as an ai language model, I don't know your location"
This isn't unique to Apple: my Amazon devices have definitely become slower to respond, and less accurate when they do, than they were a while ago.
I think part of the issue is that the companies have not found a way to “properly” monetise the services (my Echo thingies were dirt cheap and I've never done anything with them remotely like making a purchase) so they are essentially throwing money at the infrastructure to keep them running due to the fear of the backlash if they just let the services die.
I don’t even attempt to talk to assistants in my native language because I know they’ll fuck it up. I just go straight to English.
After too many frustrated attempts, I found a crude fix for people I call frequently - use the shortcuts app to set a single word to trigger that specific call. Use words that are hard to misunderstand, like "newspaper" or "fantastic". Silly hack but so far it's been working flawlessly.
Probably should be careful with "I am sinking!"
Wat are you sinking about?
When I am in the comfort of my house, I want to use my native languages as much as I can.
"So we in the MacOs trading station, will push siri enable notifications on powerbutton wakeup, if you push the following answers on keywords up the likelihood.
Deal? Deal!
Thus she existed on, living long and prosperous.. through diplomatic victories, that destroyed the qualities that could not keep her alive on her own..
When I no longer need voice control I use Siri to turn it off (I have my voice control set to label all actionable buttons on the screen so I like to turn it off when using my phone via touch.)
I'm not visually impaired however I find voice control super useful when I want control of my phone and rather not use touch or in a situation where touch is impractical.
Whilst Siri is good for certain tasks it lacks the ability to control every aspect of the phone whereas voice control is literally a replacement for touch control.
By combining Siri and voice control I get fine grained complete control of every app when I need it, all enabled/disabled via my voice. Such a brilliant combination.
Apple's attention to this disability feature is incredible. I only learnt about it after a visually impaired friend showed me how it works.
I learnt that voice control/voice over is the reason most visually impaired people use iPhone due to Apple's dedication to building world leading accessibility features.
My wife switched from a Pixel to iPhone and has utterly hated Siri and voice assist on iOS to the point she'd rather just deal with her hand pain. And it's a lot of pain.
I've listened to see how bad it can be at understanding her, and it's appalling.
But a camera that can take more than 3 pictures rapidly + iMessage + magnetic charging + airpod/iwatch support + build quality are ultimately just more important features to have on a device I use every day. I hate that getting these features right means that Apple can forever get the other features wrong.
There was also a period of about 18 months where running water in my shower would trigger Siri on my Apple Watch. Dismissing Siri was pointless, as it’d get triggered again 5 seconds later.
Modern life...
One issue I frequently have with siri is that commands that work one day suddenly don't the next. If I ask siri to "lock screen" it tries to find smart home door locks (which I don't have) instead of locking the device screen. Eventually I figure out some combination of lock device/screen/phone screen/off etc that works, so siri does know how to do this but isn't smart enough to figure out my intent.
The other issue is that siri, unlike google assistant, can't maintain a train of thought. You can't say "hey siri dim the bedroom lights" and follow that up with "hey siri..a bit more", unlike with google, where it seems to be aware of the context of what it did prior.
There’s other trash too: the other day I tried asking Siri to play a song from my library, but it misinterpreted the song title and proceeded to activate a seven day trial of Apple Music Voice to play some random track on Apple Music. I didn’t ask for that subscription!
To be fair, for the tiny subset of functionality I do use - navigate by voice, set timers, get terrible jokes to amuse passengers - it works fine.
With ChatGPT it is now feeling like Siri the broken tricycle is being compared to a Lamborghini.
Honestly I would have been concerned if Apple employees weren’t frustrated with Siri.
Pretty sure they need to take Siri out to pasture and start some LLM-based project from scratch.
I have a very low tolerance for error in things like setting timers, playing music, getting weather, calling someone. This is because I can complete these tasks simply, quickly, and with 100% reliability by hand. The marginal improvements allowed by tools like Siri aren't worth it to me if there is any decrease in reliability.
I'm sure there are use cases I'm not considering, but I've also seen perfectly able-bodied folks shout "call Steve" into their phone for 40 seconds longer than it would have taken to navigate manually. Conceptually, the idea of stacking even more engineering onto these tools to get these tasks working reliably is funny.
Apple is now at a point where even the oldest supported devices on their latest operating systems have either decent dedicated ML hardware or are Intel-based machines with enough compute power that they should really be able to do a much better job with inferencing locally (and the server component is irrelevant to local device capabilities anyway).
It should’ve been possible to bin Siri’s backend and rebuild something better from scratch by now. That Apple is swimming in money just makes things like this more jarring.
It’s honestly my greatest gripe with Apple as a software company. They put a lot of effort into polishing what they release but seemingly very little into maintaining it afterwards. macOS is the worst offender, where basic and fundamental things stay broken for years on end (settings not being applied after selection in the UI, permissions being set in the UI but not actually taking effect despite being set, continuity stopping working silently in FaceTime and iMessage. It’s gotten so bad I just accept that some former tentpole features no longer exist because they’ve become so broken and no-one cares to fix them).
Back to Siri though, it’s gotten substantially worse than when I used to use it on an iPhone 4S. That’s a decade plus of consistent regression while Google Assistant continues to progress. It’s so bad, and can’t be replaced, that I wouldn’t be surprised if it’s a leading cause of people exiting the eco-system altogether.
This is a common theme. Big Tech Company X is rich therefore they should be able to do Y. But corporations don't really work that way, they build a monopoly around a few domains and then their organization is structured to maintain that monopoly. It's the exception, not the norm, for established companies to gain competency in a new domain, and it usually comes with an acquisition or a special initiative by senior management.
But, yeah, all the voice assistants are pretty bad or at least bad enough that I mostly give up trying to use them for anything other than certain rote tasks. Siri may or may not be marginally worse but none of them are good enough to, say, really use hands-off in a car unless I've carefully pre-defined tasks to perform. (e.g. pick from a handful of memorized playlist names).
- I have a very standard midwest american accent
- I talk to siri like I’m a pilot talking to ATC: As clearly and succinctly as possible, never saying “umm”, “uh”, or having to correct myself, etc (this is actually an acquired skill that takes time)
- I generally know what things Siri can do well and what it can’t (I try to phrase things in ways I know are less likely for Siri to misinterpret)
My theory is that most people who have a bad experience either have a thick accent (and importantly don’t set the Siri language to something that matches their accent! there are multiple “English” settings in the Siri language, pick one that matches your accent!), don’t speak clearly/mumble, have a lot of noise in the environment, or some combination of the above.
I suspect it's hard to do a realistic test. Two of the most common use cases seem to be kitchen timers and playing music. Both imply a noisy environment.
I tried an Australian test, and did a (bad) video of it here: https://vimeo.com/manage/videos/207295417
Note this was... 8 years ago maybe? I forget exactly, but it doesn't seem to work this way any more.
If you have an American accent but want Siri to speak with an Australian one, set “Language” to “English - USA”, but then set “Siri Voice” to Australian. Don’t set Language to something that isn’t your accent!
Me: "Hey Siri, tell my Mike I'll send that document over in ten mins. How about pizza for lunch. I also spoke to Jenny and she would come for lunch too."
Siri: "Calling mike."
So it cannot tell the difference between 'tell' and 'call'. However, Siri could use context: if I keep talking for quite a while after saying what might be 'tell' or might be 'call', then it's most likely 'tell' because of the sentences afterwards. This is by far the most frustrating.
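A rough sketch of that context heuristic in Python. Nothing here reflects how Siri actually works: `guess_intent`, the `verb_scores` dictionary of recogniser confidences, and the `long_tail` threshold are all hypothetical names and numbers made up for illustration.

```python
from typing import Dict


def guess_intent(verb_scores: Dict[str, float],
                 words_after_name: int,
                 long_tail: int = 4) -> str:
    """Pick between acoustically similar verbs such as 'tell' and 'call'.

    verb_scores: hypothetical recogniser confidence per candidate verb.
    words_after_name: how many words the user kept saying after the contact name.
    long_tail: assumed cut-off; a 'call' command rarely has anything after the name.
    """
    if words_after_name >= long_tail and "tell" in verb_scores:
        # The recogniser considered 'tell' at all, and the user kept talking,
        # so the trailing sentences decide it.
        return "tell"
    # Otherwise fall back to whichever verb sounded most likely.
    return max(verb_scores, key=verb_scores.get)
```

With made-up scores of `{"call": 0.55, "tell": 0.45}`, a message with 25 words after the name would come out as "tell", while a bare "call Mike" stays "call".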
Second issue is:
Me: "Hey Siri - what's in my calendar today."
Siri: "Playing alternative radio station."
Music has started playing a tremendous number of times. I think the top thing on their "can't understand" code is to try and play music. About once a week Siri will take something I say about anything and start playing music instead. It's infuriating. I've deleted music from my phone but I can't delete it from TV / Homepod Mini / Mac.
Edit: I've thought of a third which isn't so bad.
Me: "Ask Hassan has the mortgage has gone out yet?"
Siri: "Here's your message. `has the Moorgate has gone out yet.`"
It will replace random words in sentences with locations. In the UK, for any word it cannot get, it will first assume I'm talking about some town in the UK and use one of those. It is ludicrous the amount of priority it gives to names of towns, I've even had tiny villages inserted into sentences before.
Me before bed: “Hey siri - turn off the lights”
Siri: “Playing death metal”
However, directly answering queries is not the only way LLMs can be used. In fact, when ChatGPT 4 came out, the first thing I thought of was: "Wow, this could make Siri so much better!"
For example, an LLM like GPT coupled with voice recognition was used to create Whisper, an AI that has nearly perfect text-from-audio recognition. One of my biggest gripes with Siri is that it is basically useless in a car because even slight background noise confuses its voice recognition. An LLM would fix this.
Another point is that many people don't realise that LLMs re-read their entire input for every word they generate! Their writing speed is so-so, but they can read really fast even when running on mobile device hardware. Think 10K to 100K words per second. An LLM could read through all of the text on your device when prompted for search queries in a fraction of a second. As long as this was carefully set up, it wouldn't be able to produce "bad output", because it would just be matching data to your prompt.
E.g.: Imagine GPT being prompted with: "Does this email match the query <q>? Say only YES if it does or NO if it does not. <email>"
It doesn't matter if it occasionally hallucinates and outputs gibberish, you just mark that as a "NO" and move on. This is also very easy to train out of a specialised version of the model using reinforcement learning.
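A minimal sketch of that filtering loop in Python, just to make the idea concrete. `ask_model` is a hypothetical callable that takes a prompt string and returns the model's raw text; it does not correspond to any real API, and the prompt wording is only adapted from the example above.

```python
from typing import Callable, Iterable, List

# Prompt adapted from the example above; the exact wording is an assumption.
PROMPT_TEMPLATE = (
    "Does this email match the query {query!r}? "
    "Say only YES if it does or NO if it does not.\n\n{email}"
)


def parse_verdict(raw: str) -> bool:
    """Return True only for a clean YES; gibberish or anything else counts as NO."""
    return raw.strip().strip(".!\"'").upper() == "YES"


def filter_emails(emails: Iterable[str],
                  query: str,
                  ask_model: Callable[[str], str]) -> List[str]:
    """Keep only the emails the (hypothetical) model clearly says YES to."""
    matches = []
    for email in emails:
        prompt = PROMPT_TEMPLATE.format(query=query, email=email)
        if parse_verdict(ask_model(prompt)):
            matches.append(email)
    return matches
```

The point of `parse_verdict` is that the model never gets to write anything the user sees: its output is collapsed to a boolean, so a hallucinated answer can at worst cause a missed match.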
PS: I just played around with GPT 4 to see how it behaves when asked to recognise requests for creating calendar entries, and it's pretty good. For example, it can correctly compute things like "next long weekend". Interestingly, ChatGPT 4 is already doing some similar prompt injection, and I can't override its sense of "current time".
Apple's Siri team has failed really badly. Everyone else is sprinting away while they're not even aware there's a race going on.
Home automation is in my opinion totally stupid to do by voice. The light should switch on when it is dark and I go into a room and should switch off when no one is in the room. But using voice which 99% of the time works is “slower”, and in one percent of the time it is annoying if the light in my kids room switches on.
I see use in the Knowledge augmented ChatGPT like assistants, since then I could reliably ask a digital assistant for information. Right now I know the moment the assistant starts answering with “this is what I found on the web” that it’s going to be wrong. Similarly annoying is that Alexa for example seems to rewrite some requests, where I ask about “what is X” and I get the answer on “what is Y”.
I’m pretty disillusioned and disappointed in all the different voice assistants out there, and I seriously think that voice on top of gpt will actually help me get more often what I want.
What if you don't want to disturb someone sleeping? Do you sleep with the lights on? Do you watch movies with the lights on?
Do you need the same amount of light to have dinner and to fix something?
I haven’t noticed any improvement in Siri in close to a decade of being in the Apple ecosystem. The most frustrating thing is that every time I ask it to “turn the lights on”, it thinks I’m saying “off”. I grew up in the US and have no notable accent or mumble.
Note that there are hacky ways to add google assistant, but it doesn’t get integrated into the OS in the same way
The grass ain’t greener, it is rubbish too.
I hate things like Siri and whatever Android phones have.
It's incredibly rude every time I hear someone almost yelling into their phone in public. They get frustrated Siri doesn't work and loudly and slowly say whatever it is again like being mean to a dumb child.
People speak out entire text messages and the replies. I'm not pointing out folks with disabilities. I'm talking about perfectly healthy adults who seem to have no sense of awareness around them.
On my phone, I found it infuriating that Apple just inserted this Siri icon into the keyboard next to the spacebar. It took me a while to figure out how to disable it. I already have large hands, but my typing accuracy is pretty good and fast. When that icon was there, it reduced the width of the spacebar and return keys and made it impossible to type without Siri popping up every few lines.
I think more broadly I have a prejudice against devices listening to me. I've never liked Alexa, Siri, or any other voice based automation system like customer support phone trees. Very rarely I've encountered a phone system that works well, but mostly these devices and systems are just infuriating to work with.
Careful. It's easy to overlook disabilities that may not be immediately externally visible.
I have a friend who's heavily dyslexic. You'd never know it if you ran into her on the street; she's quite successful in her career. Yet she uses voice dictation when sending text messages because it's an order of magnitude less time and cognitive effort for her to write something out that way.
I tend to think the world would be a better place if we were all just a little bit less judgemental of things we see others doing that we can't immediately explain.
Just to see what it would do, I gave it a basic word problem the other day (two people drive towards each other) and it had the steps right, but buried in it was a simple logic error (it claimed that the two parties traveling at different speeds would travel the same distance in a unit of time).
That it was good enough to seem trustworthy, made it worse…
I think the point people are making is that GPT models are getting better at an exciting/disturbing pace depending on one's point of view, while Siri has been around for years and has only gotten worse.
"Siri, please turn off all the lights in my apartment."
"Sure, which room?"
Sigh.
I just gave up.
Alexa is not much better. I set up voice recognition to turn on my espresso machine (Rocket is the brand). Every morning I wake up, I tell her to turn on my Rocket. For about 3 months, things worked perfectly. All of a sudden, half the time, she instead starts playing "She's a Rocket" by Robert Ealey, a thirty year-old song.
My work address is in my own contact entry. I'd just ask Siri to take me to work.
Shortcuts & Siri is honestly the main reason I switched from Android to iOS.
However, quite literally the only two things I use Siri for are asking "what song is this?" and "set a timer for X minutes."
I'm sure there are ways I could be using it for any number of things that might improve some routine process in my life, but I never found them.
At this point, they should just pipe the #1 search result into ChatGPT for a nice summary and have Siri read that out loud. It would be so much better.
https://support.apple.com/siri
Discoverability is a huge problem. Also, Siri can't launch apps, which seems like it should be an extremely common operation. "Hey Siri, launch Words with Friends".
"Turn off the Apple TV."
"Turn off low power mode."
"Stop navigating."
"Open [name of app I don't want to spend time finding in my iOS folders]"
"What song is this?" (Listens and usually is able to discover the name/artist of the song in the background.)
So when people say things like, "Siri is mostly useless," I think – wow, that is a very hyperbolic statement.
I think the big issue is that a lot of these aren't discoverable, and because Siri isn't very smart at natural language you won't discover them by asking for something similar like you could with Google Assistant. An LLM for parsing requests could be super interesting here.
(I say this as someone who has a Google Home that I use regularly, but mostly as a kitchen timer and music player.)
Neither Siri nor its competition can do this. Speech recognition accuracy is not really important, it can just take voice notes as a fallback, or ask to "press 1" to talk to me directly.
How do they conduct the interviews, in that situation?
Not the usual tech megacorp brogrammer "technical interview" hazing?
That conversation might span a few sit-downs or extend into e-mails.
If you eventually say “yes”, you’ll be funneled into the hiring pipeline midstream, skipping the entire front-end process.
It’s largely a formality — HR is told to hire you unless there’s a glaring red flag.
When I’ve been in that position, I wasn’t even asked for my resume until after we’d already negotiated my compensation, solely for inclusion in my HR file.
In some companies (the ones that aren't just cargo-culting or on autopilot) I'd bet that someone in HR/compcommittee consciously intends it to be hazing rather than evaluation. Maybe the theory is that it makes the company psychologically seem more attractive (the brain thinks, if you're jumping through hoops for them, there must be a reason), and that it takes the candidate's ego down a notch so they're less demanding in compensation negotiation.
I would settle for Siri understanding simple commands. Natural language is something for Apple customers in the 2040s. I want to not have to use creative language skills to decipher things in Reminders. "Who is this Paul Cage I'm supposed to call? Oh, repairing the pool cage."
Forget LLM or natural language processing; Siri needs to catch up to 2018's Google Assistant
Apple has never offered this and it has led to lots of head scratching at the grocery store as I work through my shopping list. It still beats trying to use the Alexa app while shopping, though.
“Set a timer for 15 minutes”
Siri: “timer app is not installed”
Excuse me? It most certainly is.
Try again later and it works.
However I have one frustrating bug with shortcuts that sends me over the moon. I have a little shortcut that just sends a simple text message when I say 'Hey siri, Sweetie'. About 20% of the time siri comes back telling me about musical artist sweetie... it's braindead.
All of Apple’s services by and large handle spotty networks horribly.
Take Apple Music for example: if you want to play music you’ve downloaded to your device, it’s often better to switch into airplane mode, because with a present-but-poor signal it will still try to load album listings etc over the network, and sit there on a spinner.
Edit: for some commands only it seems. I agree with OP, this is odd.
https://www.youtube.com/watch?v=fhLGBY5_0zU
Anyone that's used Siri will watch that and laugh at how absurd it is that someone could get even one of those voice commands to work the first time. I'm surprised there hasn't been a class action lawsuit for false advertising. I can barely get it to start a timer on my apple watch sometimes.
And something similar is happening to GPT although not to that extent. Bing has gotten worse over time as they use a pipeline of models, where smaller models serve most common and easily predictable tokens, while the full model only handles the hard stuff. Except... you can't always tell what's hard stuff or not, so Bing serves useless distracted answers ignoring user context from time to time.
"Hey siri read me my last email" "I'm sorry I can't do that" "Hey siri list my reminders" "You can use the photos app to do that" "Hey siri show me my fashion line (?)" "You can try a search in your web browser"
Just wait a bit.
“Hey Siri call X”
-“As an AI language learning model, I cannot know X from your contacts”
The obvious use case is setting a timer while cooking, when the phone is at the other end of the room and your hands may be dirty, but I'm using it even in non-cooking settings.
In fact, I just tried it and timed both approaches for setting a reminder. Results:
* Assistant: "Hey Google, set a reminder 10 days from now to file my taxes.": 4.5s to say it (I'm free to focus on other things after this point), 7.5s until reminder is created.
* Manual (pick up & unlock phone, open app drawer, open Calendar app, find the day 10 days from now, click on that day, click create a reminder, type in "file my taxes", save): 17s
I’m just going to say - i’m underwhelmed
This is the key thing, and the formula's different for everyone. I know that my strategy of "look at the clock and remember" will work until I'm senile, as will "panic-check the mail every day until my last tax form arrives and file that night"
For all Google's faults, the Google assistant is years ahead of Siri and Alexa at natural language comprehension. It's not perfect, but it clears my bar, and it comes in handy a few times per day.
More relevant for me is how much and often I need to switch my physical and cognitive task at hand to briefly use my device. If I'm in a focused mode going back and forth in my house getting things done, it's a huge benefit to just verbalize my thinking in the moment rather than come to an abrupt stop and switch to visuospatial navigation to drill down to the timer app, and fine tune the amount before I can switch back to what I was doing before.
I think the real problem is that they see Siri as a checkbox compete feature and not a core value prop that must be not just better but categorically different user experience. IMO it's a will problem, not an expertise one.
Give me a direct response if warranted, otherwise a simple chime or acknowledgement. Multiple years into ownership no-one wants a “by the way…” with upsells to other Amazon services.
It’s even worse if the reason you were talking to it in the first place is to turn something down so you can actually speak to someone.
1. Set volume to 0
2. "Turn off by the way"
3. Set volume back to your preferred.
I wonder what the conversion on 'by the way' is. Like how many BTWs have played and how many customers have taken alexa up on that offer?
I just cannot tolerate Alexa talking back to me, especially when I'm trying to do something important. I've lost my train of thought too many times because of Alexa.
The optimal amount of times a product should interrupt your workflow to tell you about a new feature is 1 or fewer.
I was pretty happy with Alexa until they started that crap. Retired it immediately. Now I only have a HomePod but even on this I have Siri disabled.
Amazon screwed the reputation of the voice robots for everyone
I don't think amazon screwed the reputation of anyone's assistant but their own. Plenty of people I know use voice assistants at home, none use Alexa.
Voice assistants are genuinely useful for some scenarios, especially in the smart home space.
Not every word, for every generation (which on something like ChatGPT is a message)
> they can read really fast even when running on mobile device hardware. Think 10K to 100K words per second
That’s definitely not right, I’ll try to find some numbers
There's also a few DSM5[1] diagnostic categories that could cover that behaviour.
However, it is a somewhat cultural behaviour - you see it more in some countries or places within a country - some people are just jerks.
Perhaps they could just add "arsehole" as a new diagnostic category to cover highly inconsiderate people. Just needs better wording.
I'm referencing people who also think it's fine to loudly play music or have entire conversations with the phone speaker on in public.
There’s also sending messages to people - you can send voice messages (“send a voice message to X”), which works decently well and avoids transcription failures.
It also has a habit of invoking whenever I say "Hi sweetie" to my neighbor's dog.
Kidding aside, with all those non-deterministic assistants, it will be hard to see when you’re hellbanned from a service.
Alexa’s inability to just answer a question and go away is why my son didn’t get in trouble for knocking it off its table (and breaking the top), and why we didn’t replace it.
But Shrinking is really worth watching.
If I come back from my run Spotify will often stop playing as soon as it gets home in range of the phone.
I'm glad you have that option but understand that for many people this will not work and there's nothing they can do to make it work.
The scary part is asking the LLM to read your messages. Imagine someone texts you an LLM jailbreak. Siri might not be Siri anymore after it finishes reading.
What you may be thinking of is that once the model stops predicting at the end of an API call the state is dropped from RAM, and if you go back with another message in a chat, then the LLM will re-read the chat log up to that point as part of the prompt. So this is true on a per-message basis, but not on a per-word or per-token basis.
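To make the per-message point concrete, here is a small Python sketch that counts how much text gets fed back in as prompt on each turn of a chat. It is only an illustration under loud assumptions: whitespace-separated words stand in for tokens, and `prompt_sizes` is a made-up helper, not part of any real chat API.

```python
from typing import List, Tuple


def prompt_sizes(turns: List[Tuple[str, str]]) -> List[int]:
    """For each user message, count the words the model must read as its prompt.

    turns: (user_message, model_reply) pairs in order.
    Words are a crude stand-in for tokens (an assumption for illustration).
    """
    history: List[str] = []
    sizes: List[int] = []
    for user_msg, model_reply in turns:
        history.append(user_msg)
        # The whole chat log so far is re-read once, when this message arrives.
        sizes.append(sum(len(m.split()) for m in history))
        # The reply joins the log and will be re-read on the next message.
        history.append(model_reply)
    return sizes
```

The prompt grows with every message because the earlier log is re-read once per message, which is a different thing from re-reading it once per generated word.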
Basing it on pricing doesn’t seem like a good way to estimate it because it varies so much but:
- OpenAI charges the same for input and output tokens, except for GPT4 where output tokens cost twice as much as inputs
- textsynth.org charges 25 times as much for outputs than for inputs
- goose.ai doesn’t charge for inputs
Also you have limited context length so that limits how much data you can read
The butterfly keyboard from previous Apple laptop generations is absolute trash. Every time it takes me a few minutes to get used to it.
Then again Teams specifically is just kind of the worst. At least they finally added the full range of emoji reacts to messages. (Still doesn’t beat Slack's custom emojis but oh well).
It's very annoying because it feels like I'm being gaslit by the device -- the errant results can pop in after some delay .... "was I misreading what I saw had been typed earlier on screen??"
I’ve also noticed my autocorrect has gone bonkers. It’s constantly trying to change the case of the word “guess” into “Guess”, the brand I guess? Despite me never ever having shopped or mentioned this. The autocorrect has gotten very aggressive and I type so fast that it’ll take me three or four tries to get it to revert to lower case. It also has some weird context sensitivity to proper nouns they added so if I say “I guess James can come over” it’ll change that to “Guess James” like it’s a name (in iMessage only I suppose) and I have to delete that entire “name” and carefully retype!
It’s gone feral with punctuation, and things like “they’re” get changed to “there” or “their”. Common misspellings seem to appear when they weren’t typed.
I assume they have some sort of special-case inclusion/handling for names in swipe dictionaries - it'd be embarrassing if your swipe keyboard recognised George and Donald as words, but didn't recognise Barrack.
And "Barack" isn't the most common spelling. I wouldn't expect it to know "Fillmore" either.
Shouldn't frequency of use matter? Why go to all this effort and not put some kind of weighting on its word choices?
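For what it's worth, the kind of weighting being asked for could look roughly like this in Python. This is a sketch, not how any real keyboard does it: `pick_word`, the gesture-match scores, and the usage counts are all invented for illustration.

```python
import math
from typing import Dict


def pick_word(path_scores: Dict[str, float], usage_counts: Dict[str, int]) -> str:
    """Combine how well the swipe matched each word with how often the user types it.

    path_scores: hypothetical gesture-match score per candidate (higher is better).
    usage_counts: hypothetical count of how often the user has typed each word.
    """
    def combined(word: str) -> float:
        # log1p damps very large counts; the leading 1.0 keeps never-used
        # words from scoring zero outright.
        return path_scores[word] * (1.0 + math.log1p(usage_counts.get(word, 0)))

    return max(path_scores, key=combined)
```

With made-up gesture scores of `{"and": 0.80, "abs": 0.85, "Abbas": 0.83}`, the raw match alone would pick "abs", but a user who has typed "and" thousands of times and "abs" twice would get "and".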
Personally, I've become convinced a good talking/listening clock is actually a useful thing, but Google isn't providing it, and it sounds like Apple isn't either.
Meanwhile, a button in the vendor app will not use the internet, so a lot less can go wrong.
But I can second his point of view from a Siri user standpoint. I use it almost every day for two insignificant tasks, and yet it works and I use it every day:
- Shutting down the remaining alarm clocks (I'm really not a morning person)
- Setting up a timer when I cook
At some point I also used it to control smart lights but I haven't set up the system again in my new flat. And I'm seriously considering reverting to a purely physical and traditional light-switch interface instead of relying on IoT. (Takes ages to set up right, rarely changing setup afterwards, waiting for updates to switch light on, flashlight mode when electronics become damaged, etc).
You’re even acknowledging how cumbersome iot is
Is that really the pinnacle of virtual assistants - billion dollar alarm handlers?
I'm convinced those error reports are round-filed.
They have 15 years of purchase history, prime video usage, and presumably are snooping on what podcasts I play over Echo. They should be able to suck a little less. Any random page of a 20 year old sears catalog would carry more relevance to me than anything Alexa has suggested.
I must have accidentally corrected "and" into "abs" once because now my iPhone always corrects "and" into "abs" when I use swipe gestures.
"Bob abs Katie are here".
Since I never notice it until I send the message, it's becoming so cocksure about the correction that there's no going back.
I think there was a dictionary you can reset.
EDIT: "Settings > General > Reset; Tap Reset Keyboard Dictionary"
Then Siri had the gall to claim NO MUSIC WAS PLAYING as this super loud music was assaulting our eardrums. My wife thought my exasperated struggles were the funniest thing, it felt like HAL-9000 with the “I’m sorry I can’t do that Dave” moment.
"Play music on dining room speaker." (dining room speaker starts blaring music at high volume) "Turn down dining room speaker." (No response) "HEY, GOOGLE. Stop music on dining room speaker." (dining room speaker music volume decreases) (from the dining room speaker) "Can't find dining room speaker." (dining room speaker volume increases, blaring music)
I did follow the advice here to rename all my speakers lowercase, since google home's VOICE interface seems case-sensitive:
https://www.reddit.com/r/googlehome/comments/jsadkp/i_unders...
And since the Google Home android app is, literally, the worst and least-reliable mobile application I've ever used, the voice interface is pretty much all I've got.
As a Sabaton fan, I laughed out loud. Technically they have classic music sounding (roughly) songs, e.g. Christmas Truce, but yeah, that's a massive fail.
These products are just hilariously bad.
FWIW I tried many alternatives like "Bach classical music" etc to no avail.
It's to the point where I've given up trying to use Siri while driving.
* Almost nothing. I still have to say "play underground eight zero s on soma fm" because reasons.
In general I find that 80's and 90's era CD's ripped directly to FLAC still sound really good.
on mine, at least. Apple does enjoy moving these things around from time to time.
Everything is a mess when I type.
I remember being amazed by how great swiping was, and after using it recently (it's not really great on gboard or samsung keyboard either, not bad but just meh) I was wondering if it was just rose-tinted glasses. But nope, I booted up my old nexus 5 with a normal gboard (not even swiftkey!) and it was still amazing. The contrast was amazing even when I wiped typing data on both phones to make sure that it was a fair comparison.
I guess the most glaring difference was the subtle, but extremely important handling of edge cases or particularly ambiguous letter combinations. Even the "bad" modern google keyboard is still 95% accurate, but the difference between that and the 99% accuracy on my nexus 5 basically makes all the difference. It goes from predictive magic to having to swipe across every single letter to make sure it works.
I remember not even knowing the spelling of words but being able to swipe in the general direction of letters I thought might be correct and it inputting the correct word.
Now it seems I have to have near-perfect tracing and pathing to get correct words.
It's very disappointing.
Note: the above paragraph was typed with swipe typing. I actually typed “Y O U” for both of the above instances of “your.”
Apple swipe typing is the opposite of AI: it’s like an idiot is going behind you taking even correct words you swipe and turning them into gibberish. Don’t even get me started about its obsession with “it’s” and “we’re” to the exclusion of the words without apostrophes. Ughhhhhh
I would not for the life of me trust Google with a first attempt at any hardware implementation, especially one that will be pretty expensive.
//Edit: Would really like to know what I got downvoted for.
https://upload.wikimedia.org/wikipedia/commons/f/fb/De-Bach....
Same way Siri understands the English “Los Angeles” even though the G sound is completely different from Spanish.
The English IPA for Bach is straightforward.
Not only Siri, but the whole of iOS. You can’t type a sentence that switches languages in the middle without changing the keyboard language all the time if you have autocorrect enabled. It will change what you type into utter gibberish, even though what you typed was perfectly correct before the “correcting”. This system is quite visibly designed by people who speak only one language and don’t understand that people may want to use multiple languages at the same time. The keyboard should support a mix of languages instead of making an XOR between them, because otherwise when it starts, it’s almost always in the wrong mode, and if it isn’t, it will almost certainly be wrong by the end of what I write.
This is an insane standard. The [x] at the end of the German word doesn't exist in English; most English speakers wouldn't be able to pronounce it if they wanted to. When the demands you're making are literally impossible, the problem is you.
For some examples:
- the famous Romanian/French modern sculptor Constantin Brîncuși (whose name uses a vowel that has no direct counterpart in either French or most dialects of English, and palatalizes the ending sh, so that it's pronounced in two syllables, brîn-cush, with a slightly pronounced ee at the end), but also Brancusi (in French, roughly bran-cu-see).
- in Japanese, since Japanese speakers have relatively few syllables they are familiar with, almost all foreign names are expected to be Japanized; for example, if your name is "Stephen", you would be expected to present yourself as, roughly, "su-tee-ve-n", and write your name with the corresponding katakana characters in certain official documents
There is another pronunciation if you want a holiday in New Zealand.
The audio clip the person posted was for a true German pronunciation, which happened to be very different than how 99% of English-speakers would say it.
Therefore, I have the impression that this is not a feature Apple would endorse much.
It's just less secure to use them than the one built into the OS because the third party keyboards might run input through some web service instead of keeping it completely local.
They're going to use the sounds that exist for them, yes.
> That seems to me a way more "insane" standard.
I hope you never get to make any decisions. Dave Barry once wrote about someone thinking "What an idiot I am! Here I am, a Japanese person, in Japan, and I can't even speak English!"
But then again, Dave Barry was joking.
> By the way, the Scottish are perfectly able to pronounce "Loch Ness"
The population of Scotland is 5 million; if you want to talk about "most English speakers", the Scottish aren't even worth noticing.
> They're going to use the sounds that exist for them, yes.
That wasn't the question I asked. They will at least try to pronounce "Heath Ledger" or "Chopin" correctly, they won't act as if there was a correct German way to pronounce those names.
I was not upset, annoyed, or confused. It's just the way language acquisition works. You learn the sounds you need and the rest are hard to acquire later in life.
Be strict in what you send, forgiving in what you receive.
As a point of interest, this is actually backwards. You're born recognizing all the sounds; what you learn is to ignore the difference between sounds that aren't distinct in your language.
You do keep that ability for the rest of your life, but it isn't helpful when you try to learn to recognize foreign sounds.
'th' is the obvious one that non-English speakers struggle with. I remember a Dutch guy laughing at my attempts at various Dutch words - I literally could not hear the difference between his pronunciation and mine.
And 'ch' (as in Loch or Bach) is a sound in Scottish English but not in English English.
I lived in Scotland till I was 4, then moved to England and all traces of my previous Scottish accent are long long gone. But my friend, whose surname is Donnachie, says I'm the only English person she's met who pronounces her name correctly - I guess because I learnt that sound early on.
Similarly, my dad, who learnt English in India, still struggles with a "j" sound (he says "zudge" instead of "judge"), despite living here for 50 years and having a posh middle-class English accent that sounds just like a "native" English speaker's.
You need to talk to a Scouser! Back and Lock will be pronounced Bach and Loch.
I don't know if "th" exists in Polish or not, but a common (perhaps dominant) spoken way to refer to "The Beatles" is[0] "Bitelsi", which not only loses "th", but also like half the other sounds in the name[1].
Thing is, we understand it just fine. More than that, if you overheard me saying to someone, "puść teraz Bitelsów" ("put on the Beatles now"), there's a good chance you'd identify the name from context. If you didn't, you could always ask to verify (well, not if you were actually overhearing me...).
----
[0] - Or at least would look like that written down. Polish is mostly a "you say it as you see it" language, but with foreign names, often enough people write the correct form but use localized pronunciation.
[1] - I'm sorry, I'm not a phonetician.
Unless they're bilingual from childhood, most people are not able to pronounce sounds outside their milk tongue without difficulty. That you expect English sounds to be perfectly pronounceable by non-English-speakers is probably more reflective of the fact that quality English education is widely available where you live than anything.