AI in Software. Smart Assistants
As technology has advanced, the tech giants have been developing their own smart assistants to give users helpful answers to their queries. These smart assistants come in many forms, ranging from virtual chatbots available online to physical digital home assistants like Alexa and Google Home. Each comes with its own visual representation, user interface, user experience, interactions, features, and accessibility options.
We study the key features and differences of 7 assistants that made their mark in this market.
Clippy (Microsoft Office)
Siri (Apple)
Google Assistant (Google)
Bixby (Samsung)
Cortana (Microsoft)
Alexa (Amazon)
Sam (Samsung)
The Office Assistant is a discontinued intelligent user interface for Microsoft Office that assisted users by way of an interactive animated character that interfaced with the Office help content. It was part of Microsoft Office for Windows (versions 97 to 2003), Microsoft Publisher and Microsoft Project (versions 98 to 2003), Microsoft FrontPage (versions 2002 and 2003), and Microsoft Office for Mac (versions 98 to 2004).
Clippy has no voice, but it can emit simple sound signals.
The assistant is proactive and shows the user tips with several options to choose from.
Clippy becomes animated on click and has various animations to show for fun.
- Clippy's most frequent behavior is to show a text bubble with the phrase "It looks like you're writing a letter" and offer to help. That may be good if you are writing your very first letter, but if you aren't (as most users weren't) it becomes "infuriating".
- If you are idle at your computer – talking to someone in real life or thinking about what to write next – Clippy starts knocking on the screen to get your attention.
What can it do?
Clippy is the only one of these smart assistants that breaks the fourth wall by tapping on the screen, as if it were a real character sitting behind the glass rather than a virtual one.
Clippy is the only assistant that has a set of characters to choose from, with different animations for each one.
Dag Kittlaus, a software tycoon from Norway, and Adam Cheyer, the co-founders of the business that created Siri (later bought by Apple), picked the name in part because it means "beautiful woman who guides you to victory" in Norse. The word is now sometimes used as a shorthand for "Speech Interpretation and Recognition Interface," according to Wikipedia.
Siri is a virtual assistant that employs voice commands, gesture-based control, focus-tracking, and a natural-language user interface to provide information, offer suggestions, and carry out tasks by sending requests to a variety of online services. With prolonged use, it learns each user's language usage, queries, and preferences and returns results that are tailored to them.
Siri offers male and female voices, with a choice of accents.
Apple's Siri voice assistant supports 21 languages: Arabic, Chinese (Cantonese, Mandarin), Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Malay, Norwegian, Portuguese, russian*, Spanish, Swedish, Thai, and Turkish. It also supports a wide variety of dialects for Chinese, Dutch, English, French, German, Italian, and Spanish.
* In MacPaw, we believe that russia is a terrorist country, and we refuse to capitalize its name.
Siri supports both the fullscreen view and the widget view on iOS.
On the desktop, Siri functions only in the upper right corner of the workspace.
Ways to call Siri:
- By voice (Hey, Siri)
- A single tap or long hold on the side button (or Digital Crown for Apple Watch)
- Tap a dedicated Siri button on the latest Mac devices
- "Raise to speak" on Apple Watch
- "Dictation" button on Mac devices
The first interaction with Siri is the voice setup so that it recognizes the user's voice in the future.
Animation & Sound
What can it do?
Siri also provides a ton of pre-programmed answers to humorous queries. Such inquiries include "What is the meaning of life?" and "Why am I here?"
Apple puts a lot of effort into the accessibility of its products, and Siri shares all iOS accessibility features. In addition, there are some Siri-specific accessibility features that allow the user to:
- set how long Siri waits for them to finish speaking
- type instead of speaking to Siri
- control voice feedback for Siri (Don't Speak in the Silent Mode, Only Speak with Hey Siri, or Always Speak responses)
- use "Hey Siri" when the iPhone is covered or facing down
- hide the other apps when Siri is active
- have Siri hang up phone and FaceTime calls
In July 2019, a then-anonymous whistleblower, former Apple contractor Thomas le Bonniec, said that Siri regularly recorded some of its users' conversations even when it was not activated. The recordings were sent to Apple contractors who graded Siri's responses on a variety of factors. Among other things, the contractors regularly heard private conversations between doctors and patients, business and drug deals, and couples having sex. Apple did not disclose this in its privacy documentation and did not provide a way for its users to opt in or out. In August 2019, Apple apologized, halted the Siri grading program, and said that it planned to resume it "later this fall when software updates are released to the users". iOS 13.2, released in October 2019, introduced the ability to opt out of the grading program and to delete all the voice recordings that Apple has stored on its servers.
Siri also integrates with HomePod, which allows the user to create sequences of requests.
Siri can recognize the voices of up to six different family members on the HomePod mini, and create a personalized experience for each person. So the music Mom hears when she asks for something she'd like is totally different from what the kids hear when they ask.
Siri also includes an intercom capability that enables it to broadcast announcements to other iOS devices or other HomePods across the house.
Siri is the only intelligent personal assistant that speaks with different accents in both male and female voices.
Siri also works with the Shortcuts app in iOS, which lets the user create a sequence of actions, such as "Text Last Image" or "Attach a note to Ulysses sheet".
Google Assistant is a virtual assistant software application developed by Google that is primarily available on mobile and home automation devices.
Google Assistant has 10 voices available for English speakers in the U.S. There are 6 female and 4 male voices to choose from. The male voices are called Orange, Green, Blue, and Pink. The female voices are called Red, Amber, Cyan, Purple, British Racing Green, and Sydney Harbour Blue.
Google Assistant supports Danish, Dutch, English, French, German, Hindi, Italian, Japanese, Korean, Norwegian, Spanish, and Swedish. It also supports different English, French, Spanish, and German dialects.
You can use up to 3 languages with the Google Assistant on your device: your Android language, plus 2 Assistant languages.
- Smart Speaker
- Voice prompt from the user "Hey Google"
- Tap on the button
- Magic tap (a two-finger tap on the screen)
Interaction (Animation & Sound)
What can it do?
- Manage routine tasks: send a text, set reminders, turn on the battery saver, and instantly look up emails
- Plan the user's day: check the flight status, make a dinner reservation, check when a movie starts, and find a coffee stop along a route
- Provide entertainment: control music on YouTube Music, resume podcasts on Google Nest Mini or a smart display
- Make memories: find and take photos
- Get answers: the latest on weather, traffic, finance, or sports, and quickly find translations while the user is traveling
- Control the user's smart home devices: adjust the temperature, lighting, and more
Google Assistant is now available in Snap Core First (a symbol-supported AAC app, developed to help people with language disabilities communicate).
Google Assistant-specific accessibility features:
- Voice control (thermostat, home pod)
- Telling the user who's at the front door (+ unlocking the door with a password)
- Verbose mode (the assistant reads the options on screen)
- Google Home sounds (when the user can't see the lights)
- Adjusting the equalizer
- Continued conversation (keeping the microphone on for longer)
- Screen reader
- Using the Assistant on Chromebooks
In July 2019, Belgian public broadcaster VRT NWS published an article revealing that third-party contractors paid to transcribe audio clips collected by Google Assistant had listened to sensitive information about users. Sensitive data collected from Google Home devices and Android phones included names, addresses, and private conversations captured after mistaken hotword triggering, such as business calls or bedroom conversations. Of the more than 1,000 recordings analyzed, 153 had been recorded without the "OK Google" command. Google officially acknowledged that language experts listen to 0.2% of recordings to improve Google's services.
- Best-in-class speech recognition
- Great general question results
- Remembers context
- Good with home devices - but no notes or texts
Bixby is a virtual assistant developed by Samsung Electronics. It represents a major reboot for S Voice, Samsung's voice assistant app introduced in 2012 with the Galaxy S III. S Voice was later discontinued on 1 June 2020. In May 2017, Samsung announced that Bixby would be coming to its line of Family Hub 2.0 refrigerators, making it the first non-mobile product to include the virtual assistant.
Bixby comes in three parts, known as "Bixby Voice", "Bixby Vision", and "Bixby Home". In the latest One UI software update, Bixby Home was replaced with "Samsung Free", which Samsung had recently started developing.
Available in US English, British English, Indian English, French, Korean, Chinese (China), German, Spanish, Italian, and Portuguese (Brazilian Portuguese).
- Samsung Phones
- Samsung DeX
What can it do?
- Send text messages, set reminders, read emails, make phone calls
- Read between 1 and 20 of the most recent messages when prompted with the phrase "Read my latest text message"
- Upload a selfie to Instagram, create a photo album with a name of choice, play a specific artist on Spotify, and even rate an Uber driver
- Bixby Vision (reader, scene describer, color detector)
- Full vocal control
- Bixby is the only assistant built into refrigerators
- Samsung devices have a dedicated Bixby button, although the internet is full of articles on how to remap it from Bixby to Google Assistant
- Bixby Alarm
- Thanks to Bixby's iterative deep learning technology, the more the user uses the interface, the smarter Bixby gets
- It can do multi-step tasks (like opening and downloading files)
- Standard tasks still trip it up
- Recognition in louder places is poor
- Not much app support
- Smart home support is poor
- The only assistant with visual recognition
Cortana is a virtual assistant developed by Microsoft that uses the Bing search engine to perform tasks such as setting reminders and answering questions for the user.
Available in: English, French, Chinese (Mandarin), German, Spanish, Italian, Japanese, and Brazilian Portuguese.
Windows Phones (not anymore)
Most versions of Cortana take the form of two nested circles, which are animated to indicate activities such as searching or talking. The main color scheme includes a black or white background and shades of blue for the respective circles.
To activate Cortana, a user can say "Hey, Cortana" if voice settings are enabled, press the Windows key plus C, or click the Cortana button next to the search bar on Windows 10. In Windows, the Cortana window then appears and the user can type in their question or request.
Note Cortana's ability to mimic voices and to take on a spooky or funny tone, which makes it sound more human-like than other assistants.
What can it do?
With Microsoft's fading support for Cortana, the digital assistant has lost more features over time. However, Cortana still includes the following features:
- Use of natural language. As opposed to having to use specific words, users can make requests in plain English.
- Multitasking. On Windows 10 desktops, Cortana can open programs, find files and read or send email messages. Users can, for example, type a request to Cortana or turn on the microphone and speak to the program.
- Integration with Microsoft Edge. In Edge, users can highlight a word or phrase and Cortana displays more information on the topic.
- Reminders. When connected to Office 365, Outlook.com, or Gmail, Cortana can create and suggest reminders.
- Calendars. Working with Outlook, users can manage their calendars and make appointments or block off time for meetings.
- Accessibility features. These include settings to read overviews of new emails, read high-priority emails or find the best times for meetings.
As for features that were removed, Cortana can no longer be requested to play music, control video streaming services, create location-based reminders, or control smart home devices.
Cortana functions only within the Microsoft ecosystem: it benefits from Microsoft's accessibility features but doesn't have any of its own.
Cortana indexes and stores user information.
Cortana can be disabled; this will cause Windows search to search Bing as well as the local computer, but that can also be disabled.
Turning Cortana off does not in itself delete user data stored on Microsoft's servers, but it can be deleted by user action.
Microsoft has further been criticized for requests to Bing's website for a file called threshold.appcache, which contains Cortana's information, when searches are made through the Start Menu, even when Cortana is disabled on Windows 10.
As of April 2014, Cortana was disabled for users aged under 13 years.
- The British version of Cortana speaks with a British accent and uses British idioms, while the Chinese version, known as Xiao Na, speaks Mandarin Chinese and has an icon featuring a face and two eyes, which is not used in other regions.
- After March 31, 2021, the Cortana mobile app is no longer supported.
- The name Cortana comes from Microsoft's flagship game series Halo.
- Cortana uses the Bing search engine
- It is good at voice interpretation
- Long responses
- 3rd party apps support is poor
- Poor social app integration
- Not good at learning user habits
Invoke smart speaker powered by Cortana.
Amazon Alexa, also known simply as Alexa, is a virtual assistant technology largely based on a Polish speech synthesizer named Ivona, bought by Amazon in 2013.
In addition to the two main Alexa voices, Amazon also has celebrity voices. Right now, you can choose between Samuel L. Jackson, Shaquille O'Neal, and Melissa McCarthy. Last year, during the holiday season, Amazon added Santa Claus as a more limited "celebrity voice." (He only answers certain holiday-related queries.)
Amazon Alexa can speak English, Spanish, French, German, Italian, Hindi, Japanese and Portuguese.
Dialects: English, Spanish, French.
- Amazon Echo
- Smart Home
- Wearables and earphones
Alexa's avatar has colors transitioning from light to dark (top to bottom), evoking a sense of enlightenment and calm.
The blue and white symbol depicts a white speech bubble on a blue background. The graphic appears to have a double meaning.
While the white cloud symbolizes voice control, the blue background looks similar to the decorative LED strip which runs around the edge of the Echo speaker. The new symbol is also a kind of badge of approval for a lot of products. This is a great way for companies to gain the trust of their audience with a recognizable image.
- also color indicators on Amazon Echo
- Echo Dot
- All Echo Dot sounds
What can it do?
- Tap to Alexa
- Voice control
- Real-time text
- Adaptive listening
- Show and Tell
- VoiceView Screen Reader
- Call Captioning
- Control Speaking Rate
In early 2018, security researchers at Checkmarx managed to turn an Echo into a spy device by creating a malicious Alexa Skill that could record unsuspecting users and send the transcription of their conversations to an attacker.
In November 2018, Amazon sent 1700 recordings of an American couple to an unrelated European man. The incident proves that Alexa records people without their knowledge. The company dismissed the incident as an "extremely rare occurrence" and claimed the device "interpreted background conversation" as a sequence of commands to turn on, record, send the recording, and select a specific recipient.
During the Chris Watts interrogation/interview video, Watts was told by the interrogator, "We know that there's an Alexa in your house, and you know those are trained to record distress", indicating Alexa may send recordings to Amazon if certain frequencies and decibels (that can only be heard during intense arguments or screams) are detected.
In June 2022, at its Re:MARS conference, Amazon demonstrated a feature in development that would let Alexa mimic a specific person's voice. The example shown was of a deceased grandmother reading a story to a child. The AI is capable of learning a voice from less than a minute of recorded audio. Some people raised ethical concerns about this, specifically the consent of the dead and the potential use of such technology by criminals. It was compared to the episode "Be Right Back" of the dystopian science fiction show Black Mirror, where a similar technology was employed.
- Alexa's predecessor was Ivona, an assistant developed by a Polish company bought by Amazon in 2013. Ivona was inspired by the 2001: A Space Odyssey movie. In November 2014, Amazon announced Alexa alongside Echo. Alexa's creators took inspiration from the computer voice and conversational system on board the Starship Enterprise in science fiction TV series and movies Star Trek: The Original Series and Star Trek: The Next Generation.
- Amazon developers chose the name Alexa because it has a hard consonant with the X letter, which people recognize with higher precision. They said the name is reminiscent of the Library of Alexandria, which is why the Alexa Internet project used it too.
- In 2021, the BBC reported that because of Amazon using the name Alexa, bullying and harassment of children, teenagers, and adults named "Alexa" has substantially increased, to the extent that at least one child's parents decided to change her name legally.
- Alexa's technical skill list has several peculiarities:
- Multiple-user support (Amazon Echo)
- 15000 skills available
- Not designed to be mobile first
- No calls or messages, the assistant refers to the Alexa app to do that
- Can't open apps or change settings
- Specialized for purchases (via Amazon, of course)
- Uses Bing search engine
These 3D renders were created by the Brazilian design firm Lightfarm in collaboration with Cheil, a marketing company that Samsung owns. The idea behind this project wasn't to use these renders to promote Samsung products but to showcase what a theoretical virtual assistant could look like in human form.
Samsung didn't officially make Sam, or even acknowledge her.
Samsung has not only explained the purpose of Sam, but also what Sam is not designed to be. Specifically, Samsung confirms that Sam is not a replacement for Bixby. In explaining this, Samsung highlights that, in contrast to Bixby being installed on consumer devices, including Galaxy Phones, Sam is "the chatbot of the Samsung Service team."
What can it do?
Sam is a chatbot that's designed to auto-respond to consumer questions and queries via one of many online platforms, including Facebook Messenger.
When the character's pictures were released, Sam quickly became one of the most meme'd assistants ever.
Sam also triggered lots of art and video recreations.
Smart assistants tell jokes (bonus round)
- Andy Warhol once painted a can of bean soup. It's one of his lesser-known fart works.
- A high jumper walks into a bar. The coach says noooooooo. You're supposed to jump over it.
- What does a fart laser sound like? Phew phew
- Which animal is the least trustworthy? Cheats. (roaring sound + a honk)
- What did the vinaigrette say to the refrigerator? "Close the door, I'm dressing". (duck sounds + door closing)
- What kind of music do bunnies like best? Hip-hop. (Jumping cartoon sounds)
- A fly says to a human, "Do you have any food?" The human says no. The fly leaves. The fly comes the next day and asks, "Do you have any food?" The human says no, and the fly leaves. The fly comes the next day and asks, "Do you have any food?" The human says, "No, I told you twice already. If you keep asking, I'll smash you with a newspaper." The fly leaves. The next day the fly flies to the human and asks, "Do you have a newspaper I can borrow?" The human says no. The fly asks, "Then do you have any food?"
- People are shocked when they find out I'm not a good electrician.
- A couple of years ago two snails went on a stroll in the park. They're having a great time.
Smart assistants have come a long way in the last two decades, providing users with increasing levels of utility and convenience. Today's digital assistants are pushing the boundaries of what is possible with AI technology - from being able to understand and respond to natural language prompts to using complex algorithms that allow for greater functionality than ever before.
Unfortunately, this fantastic capability occasionally comes at a cost – almost all smart assistants have serious security issues that can leave their associated devices vulnerable to hacking or other malicious activities. As such, it is important for those who rely on these devices to stay vigilant about any software updates or fixes available for their smart assistant of choice.
This is an independent publication and it has not been authorized, sponsored, or otherwise approved by Microsoft Corporation, Apple Inc., Google LLC, Samsung Electronics Co., Ltd., or Amazon.com, Inc.