Voice control in digital signage: more than a technical issue?

Voice control is now an integral part of many consumer devices: Samsung reports integration of its Bixby voice technology into everything from fridges and other appliances to smartphones and wearables. Bixby started as a practical way to use your voice to interact with your phone. Now it is evolving into a scalable, open AI platform to support all of the devices in a consumer’s life.

Earlier this year, Samsung introduced the new Bixby, a more conversational, more personal and more useful intelligent assistant. Bixby keeps the dialogue moving forward to get things done. It gets to know its users and learns from them, to personalise how it helps each individual. Bixby serves as a personal assistant, making it quick and easy to complete a variety of tasks from start to finish through a growing number of third-party partners. If a user asks: “Hi, Bixby, give me a restaurant recommendation,” Bixby remembers cuisine preferences, like Italian, and brings those recommendations to the top of the list. Ask Bixby to buy tickets to a concert, and it will ask follow-up questions to find the best seats for the user.

Now Samsung is bringing that experience to even more devices and services. To evolve Bixby into Samsung’s intelligence platform, the company is laying the groundwork to build a scalable AI platform, making it simpler and easier for developers to create user experiences powered by Bixby.

At Samsung Developer Conference 2018, Samsung announced that it is opening Bixby Developer Studio, an Integrated Development Environment (IDE), to developers. It offers access to the same development tools Samsung’s internal developers use to create Bixby Capsules, which are what developers build to add features to Bixby. It’s all part of Samsung’s approach to AI, with the goal of building a scalable, open AI platform where developers and service providers can access tools to bring Bixby to more people and devices around the world.

“Our goal is to offer developers a robust, scalable and open AI platform that makes it easy for them to launch and evolve the amazing experiences they create for our users,” said Kyunghak Hyun, Product Manager of the AI Product Management Group at Samsung. “As Samsung’s ecosystem of devices continues to grow, it creates more points of contact not just for Bixby, but for a growing number of third-party services as well. Developers have more ways to reach users, and users get more opportunities to make the most of them.”

So how long will it be before Bixby is available for Samsung’s The Wall display?

Technical barriers

Unlike an application in a consumer appliance or a car, where communication is effectively one-to-one, voice control of a digital sign has to select one voice command from a cacophony of background chatter. How often have you found that the far-field microphones installed in voice-controlled devices have accidentally picked up a stray ‘Alexa’ or ‘Siri’ command from a bystander?

In a conversation, Jeff Hastings, CEO of BrightSign, acknowledged that voice does have a role in digital signage, and said that the development of near-field microphones is something that BrightSign is actively pursuing. Another line of development is the combination of digital signage and the mobile phone, where information flow is bidirectional. This obviously poses privacy problems in one of the most likely applications to emerge: medical advice dispensed from digital signs in doctors’ waiting rooms and pharmacies.

Last year Microsoft revealed a push to get OEMs to support Cortana on Windows 10 better, via support for wake on voice and far-field technology. Far-field technology would allow users to access Cortana on their PC from up to 4 metres away, much as an Amazon Echo works now, even while music is playing or there is background noise. It requires that a PC include a high-quality microphone which meets Microsoft speech spec 2.0.

Voice on trial

Currently there is something of a battle raging between Microsoft with its Cortana technology and Amazon with Alexa. Critics point out that while the two are often seen as synonymous, there are differences. Microsoft has been spreading Cortana around as much as Amazon has Alexa. The difference is that Alexa is a single, cheap device sitting in a room: it’s relatively easy to implement and so to permeate the market. Cortana is generally considered to be the more powerful potential solution of the two, but the product lacks the direction needed for it to be exploited and perfected across the wider market for voice control.

Amazon’s Alexa, on the other hand, is compatible with a much wider range of devices but both technologies suffer a similar disadvantage of being predominantly tied to the English language. While Alexa’s progress might be largely attributable to marketing success, there are signs that the technology is spreading to more serious applications. For example, the NHS is working with Amazon to allow elderly people, blind people and other patients who cannot easily search for health advice on the internet to access the information through Alexa.

[Photograph: Dimension Systems. Voice-activated digital signage in wayfinding. Voice-activated digital signage could relieve some of the pressure on the NHS.]

The health service hopes patients asking Alexa for health advice will ease pressure on the NHS, with Amazon’s algorithm using information from the NHS website to provide answers to questions such as: “Alexa, how do I treat a migraine?”; “Alexa, what are the symptoms of flu?”; and “Alexa, what are the symptoms of chickenpox?”. The ROI? The Department of Health (DoH) said it would empower patients and hopefully reduce the pressure on the NHS by providing reliable information on common illnesses.

The DoH said the NHS would not be giving patients Amazon Echo devices for dispensing NHS-verified information. The Guardian newspaper reports that, by 2020, half of all searches are expected to be made via voice-assisted technology. “Under the NHS long-term plan, which aims to improve the quality of patient care and health outcomes in an effort to relieve pressure on the health service, it has committed to making more services available digitally.”

The privacy issue

Placing sensitive medical information on a digital device or network is always a delicate matter. Further privacy concerns were raised about Alexa this year after Bloomberg News reported that Amazon employees around the world regularly listen to recordings from the company’s smart speakers as part of the development process for new services.

And Siri, Apple’s voice assistant, has long been known to have the power to record private conversations. These audio clips aren’t always just stored on a server. A number of samples are passed along to third-party, human contractors who are paid to listen to them.

While Apple trades on the assertion that high-level security comes included with its products’ high prices, it has always been clear that by using Siri, or any voice assistant, the user must allow their phone to record and analyse their voice. It’s also known that Google Assistant records and stores audio, but there is at least an option to automatically delete your data every couple of months. Amazon’s Alexa stores queries until the user manually deletes them, and both Amazon and Google employ contractors to review a small number of their recordings. Cortana collects voice data “in order to improve its service”, and Samsung’s Bixby does the same.

Voice is the future?

Voice is often described as the future of computing, including voice assistants, voice-recognition technology, ambient computing and the widespread use of smart speakers in the home. But voice is also the future of surveillance: earlier this year the US National Security Agency’s voice-recognition systems, including a project called Voice RT (“Voice in Real Time”), were reported as having the stated ambition of identifying the ‘voiceprint’ of any living person.

So, for developers of digital signage, the technical issue of developing near-field microphones, the language issue and even the privacy problem might be as nothing when compared to the public acceptability of solutions that listen, record and advise on confidential matters!

Signagelive and AOPEN showcase voice-activated digital signage

Signagelive and AOPEN showcased voice-activated digital signage as long ago as 2017. The showcase was presented at Google’s Munich office, featured Google Home, and was staged for the audience of the Google Chrome for Work Summit.

This is an annual event that Google holds for EMEA-based end users and resellers to update them on the latest technologies from Google and vendor partners. Signagelive has supported Google with this event in the past and this year was no different.

With the help of AOPEN, Signagelive showcased the Web Triggers feature to instantly change content running on devices. Using Google Home the user can start a conversation with a bot and control the content displayed on a Signagelive managed smart display or player. In this case, the new Chromebox Mini from AOPEN was used.

Signagelive is a cloud-based digital signage platform provider whose product is purchased through distribution and reseller channels. Customers are supported at no additional cost. A Chrome device can be turned into an interactive kiosk with the content interaction and management tools provided by Signagelive. The demonstration used a Google Chromebox, but according to Signagelive the voice control would work equally well with any of its supported media players.

Applications are many and varied, from a business assistant through to a retail assistant, where products and availability can be shown just by asking for what you are looking for.

To find out more contact sales@signagelive.com

TrouDigital: voice-activated digital signage pioneers

“A clear message has echoed around the technology world in the past 12 months: voice search changes everything. We couldn’t agree more,” says a TrouDigital blog post written by Lee Gannon.

“One thing seems certain to us. Voice-activated digital signage won’t be a passing novelty. It’s not a gimmick you will see once demonstrated at a trade show and hardly again beyond that.”

“The whole technology is set to move in this direction. In the near future, audiences and customers will go up to any digital display, start talking to it and expect interaction.”

“At the moment, talking to Alexa or Siri might still feel like a very private thing, too socially awkward for public. But this anxiety barrier is quickly coming down. In homes around the world, users are becoming more comfortable talking to their devices.”

“It’s only a matter of time until the majority of people will be comfortable speaking to their tech at work and out in public. By 2020, 50% of all searches are predicted to be voice. A fundamental mentality change is imminent, a new type of consumer behaviour is emerging that all businesses need to seriously anticipate.”

“That brings us to voice-activated digital signage. Our developers at TrouDigital have been working on an integration between our signage platform and the Google Assistant. We’ve seen how popular interactive signage is, whether it’s through touch or something like our ‘lift and learn’ feature. The only challenge with these solutions, however, has often been cost. Touch screen monitors still have a significant price premium, and ‘lift and learn’ displays require additional materials and ultimately space.”

“One of the reasons we are so excited about voice-activated signage is its accessibility to all. The Google Home and Amazon Echo are taking over in part because they are affordably priced to consumers. With our own voice-activated digital signage, we intend to follow this trend. We want to see even the smallest of schools and local businesses welcoming this innovation as this is where it can have the largest impact.”

Voice-activation is on the cusp of widespread adoption in digital signage, and while there are undoubtedly technical issues to be resolved, these might be trivial in comparison to the social and political challenges. AV News reports.

“At least initially, we will be introducing a set number of voice-triggers that allow users to interact with content on their screens. We believe the introduction of voice-triggers will have a huge influence on the type of content people are creating too. On a screen up in a reception area, a welcome message might be designed for visitors that can be activated with the command ‘Welcome’. For employee-facing screens, content such as training videos or sales targets might equally be triggered with bespoke commands: ‘Play the new employee training video’ and ‘Show this week’s sales targets’. There are endless applications for wayfinding screens, e.g. ‘How do I get to _______’, retail with product information and educational content for schools.”

“Ultimately, voice-activated digital signage will invite different industries and users to really tailor the solution to their needs. Whole new uses for digital signage are likely to emerge, taking the technology in interesting new directions. At TrouDigital we are committed to the principle that voice brings real utilitarian value to users rather than being merely a ‘nice to have’ add-on feature.”

“The way this is going to be achieved is through conversation with our users, across different sectors. Our voice-triggered solution will be an evolving project, constantly in a state of refinement.”
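The voice-trigger idea described in the TrouDigital quotes above can be illustrated with a minimal sketch. It assumes a speech service has already turned the spoken command into text; the trigger table, the content identifiers and the function names are hypothetical and are not taken from TrouDigital's platform.

```python
import re
from typing import Optional

# Hypothetical trigger table: spoken phrase -> content identifier.
# The phrases are the examples quoted above; the content IDs are invented.
TRIGGERS = {
    "welcome": "reception/welcome-message",
    "play the new employee training video": "staff/training-video",
    "show this week's sales targets": "staff/sales-targets",
}


def normalise(transcript: str) -> str:
    """Lower-case, unify apostrophes, drop punctuation, collapse whitespace."""
    text = transcript.lower().replace("\u2019", "'")
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return " ".join(text.split())


def match_trigger(transcript: str) -> Optional[str]:
    """Return the content ID for a recognised command, or None."""
    text = normalise(transcript)
    if text in TRIGGERS:
        return TRIGGERS[text]
    # Wayfinding: "How do I get to <place>" maps to a per-destination screen.
    found = re.fullmatch(r"how do i get to (?:the )?(.+)", text)
    if found:
        slug = found.group(1).replace("'", "").replace(" ", "-")
        return f"wayfinding/{slug}"
    return None  # unrecognised speech leaves the current content in place
```

A fixed table like this matches the "set number of voice-triggers" described in the quote; anything the table does not recognise is simply ignored rather than guessed at.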

Email mario@troudigital.com for more information.

MATRIX Voice from AdMobilize

MATRIX Voice is the latest advancement from AdMobilize, the artificial intelligence and computer vision company. “Put simply, the company that introduced AI-powered audience analytics to the digital signage industry is now bringing voice recognition functionality to both manufacturers and systems integrators, alike, through its MATRIX product line,” explains AdMobilize co-founder and CEO Rodolfo Saccoman.

“We believe that voice engagement technologies will make digital signage a more compelling and sticky communications solution for an even broader range of vertical markets.  The combination of audience analytics and voice recognition functionality truly represents the next chapter in this constantly evolving industry — and AdMobilize is at the forefront of making this chapter a reality.”

Available for only $55.00, MATRIX Voice will integrate with any voice recognition service (Amazon Alexa, Google Assistant or any other third-party service) at any time.  “MATRIX Voice puts the power of flexibility directly in the hands of the manufacturer and systems integrator, freeing them from being confined to any one of the currently available voice services. This enables plug-and-play custom voice solutions to go mainstream,” Saccoman emphasized.

Equally important, according to Saccoman, is the complete security that MATRIX Voice provides. For retailers who do not want Amazon listening to, recording and storing all store information on its cloud, MATRIX Voice is the ideal solution as it provides the capability to process voice recognition at the edge and not strictly in the cloud.

MATRIX Voice can be run on a Raspberry Pi or standalone, thanks to an optional module (ESP32) that equips it with a micro-controller as well as Wi-Fi and Bluetooth connectivity. MATRIX Voice has an 8-microphone array, a 3.5mm audio output jack, 2 speaker outputs, a 3 A audio amplifier, 24 expansion GPIO ports, 64 MB of RAM, 64 MB of Flash, and the second-tier Spartan 6 FPGA, which allows manufacturers and integrators to customize the functions of the GPIO pins as well as implement their own audio and voice algorithms.

From a software perspective, the company has created 3 library layers to program the MATRIX Voice.  The first layer, HAL, allows integrators to program it in C++, providing the closest access to the hardware.  The second layer, MATRIX Core, contains protocol buffers and ZeroMQ that enable designers to program the MATRIX Voice in over 40 languages for complete interoperability with any pre-existing code base.  Finally, the third and highest layer, MOS (MATRIX Open System), empowers users to easily and quickly program the MATRIX Voice in JavaScript with as little as 1 line of code as well as take advantage of the company’s remade infrastructure which includes dashboard, remote deployment through its CLI tool, simple communication between devices (crosstalk), and much more.

“Let’s say you have 50 digital screen directories already installed within a mall,” Saccoman explained. “With MATRIX Voice you can now place it within the top or bottom of the directory (depending on space and optimal audio clarity for the microphones) and then use a 3rd party voice recognition service to create custom ‘wake’ words and responses to enable the shopper to ask, ‘Directory, where is Macy’s?’ and have the directory respond with the route to Macy’s from the shopper’s location. All sorts of voice activated cues and information can now be programmed into a digital signage network, opening up the door to a whole new series of applications.”
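The mall directory scenario in the quote above can be sketched as follows. This is an illustration under stated assumptions, not AdMobilize's software: it assumes a third-party recognition service has already produced a text transcript, and the route table, the wording of the responses and the function name are all hypothetical. Each directory would carry its own route table, since routes depend on where that screen stands.

```python
import re
from typing import Optional

WAKE_WORD = "directory"  # the custom wake word from the quoted example

# Hypothetical route table for ONE directory screen: store -> spoken route.
ROUTES = {
    "macy's": "Macy's is on Level 2, to the left of the escalators.",
    "food court": "The food court is on Level 3, straight ahead.",
}


def handle_utterance(transcript: str) -> Optional[str]:
    """Return a response, or None if the speech was not addressed to the sign."""
    text = transcript.lower().replace("\u2019", "'")
    words = re.sub(r"[^a-z0-9' ]+", " ", text).split()
    # Ignore background chatter that does not start with the wake word.
    if not words or words[0] != WAKE_WORD:
        return None
    query = " ".join(words[1:])
    prefix = "where is "
    if not query.startswith(prefix):
        return "Sorry, I can only give directions."
    store = query[len(prefix):]
    route = ROUTES.get(store)
    if route is None:
        return f"Sorry, I could not find {store}."
    return route
```

Requiring the wake word as the first word is a deliberately simple way of ignoring bystanders' chatter, the problem raised earlier under technical barriers; a deployed system would more likely do this in the microphone hardware or in the recognition service.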

AdMobilize continues to bridge the gap between the digital revolution and disruptive computer vision analytics, ultimately infusing machine learning into formerly ‘unintelligent’ displays. Saccoman concludes that the era of “touchless technologies” will power the renaissance of the digital signage industry.
