Creating Secure Voice-based AI Apps: Insights from Developers

We delve into the often overlooked experiences of developers and discover challenges that affect the security of the whole voice-app ecosystem.

In the ever-evolving landscape of voice-activated AI systems, voice apps (referred to as “skills” for Amazon Alexa and “actions” for Google Assistant) are largely created by third-party developers. While it is natural to focus on the user experience of a voice app, there is also value in considering the perspective of those at the other end: the developers who build these skills.

In the first study of its kind, the SAIS Project interviewed 30 voice app developers, both professionals and hobbyists, to explore their experiences and viewpoints and shed light on the intricacies of this nascent ecosystem. The discussions covered a broad spectrum of security-related topics, including the development process, data collection, privacy policies, relationships with the major platforms, and monetisation strategies.

As the group with primary creative control over voice apps, their content, and their privacy policies, developers bring a perspective that is essential to discussions about the voice AI ecosystem. The key findings below highlight developers’ concerns and the operational hurdles they face:

Discoverability Issues: Voice apps suffer from poor discoverability compared to mobile and web apps. Naming conventions and how naturally users can invoke an app play a crucial role in user engagement, yet many developers find these aspects challenging to optimise. In addition, developers have found that, unless a voice app is backed by an expensive marketing campaign, users are unlikely to find it.

Monetisation Difficulties: Many developers struggle to monetise their voice apps. Almost all hobbyist developers had made no money from their voice apps and were more likely to be paid in non-monetary benefits; for instance, Amazon would pay them in AWS (Amazon Web Services) credits, which often meant they did not have to pay to develop skills. Those who did generate a regular income faced a number of challenges. In-app purchases suffered from poor usability and were described as frustrating and scary for users. In-app advertising is prohibited in Google Actions and was only recently allowed in Alexa Skills. And the previously lucrative Alexa Developer Rewards programme has significantly reduced its payouts to top-performing skills. For small-scale developers, this created an incentive to pursue ‘riskier’ monetisation strategies, further enabled by platform negligence in identifying and removing skills with non-compliant or illegal privacy practices.

Interestingly, the most effective monetisation strategy identified was redirecting support requests from call centres to voice assistants, which is more efficient and cost-effective for large companies with many customers. In some cases, investing in voice SEO to ensure accurate responses from existing voice assistants, i.e. Google Assistant and Alexa, was more cost-effective than developing a voice app.

Certification Inconsistencies: The certification processes for third-party apps on Alexa and Google Assistant are intended to ensure security, privacy, and compliance with the platforms’ policies before apps become publicly available. However, developers viewed these processes as opaque, with wait times ranging from a few days to a few weeks and standards varying between certification teams in different geographical regions. Certification teams were known to reject updated skills on the basis of functionality that had previously been approved. The general view was that skills with ‘run of the mill’ functionality would proceed through the process without issues, but complicated or unusual functionality was much more likely to generate problems.

Professional developers believed that more popular Alexa skills were subject to stricter certification rules. One developer commented that publishing through a company developer account would mean additional automated and human checks compared to skills that “don’t get any usage, they don’t get any traffic pointed to them.” One possible reason for this approach is that smaller skills are unlikely to attract the same level of attention from academics, journalists and regulators; it reduces the cost of moderating the platform for Amazon by requiring as little human support as possible whilst still, technically, checking each skill that gets published.

To reduce friction during the certification process, professional developers often relied on internal contacts to navigate these complexities. Changes to an app’s interaction model require recertification, so developers were strategic in crafting flexible models to reduce the number of certification passes required. In practice, this means the user commands that a voice app responds to are fixed at the certification stage, while the responses the app gives can be changed at any time in the back-end, as illustrated in the sketch below.
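Here is a minimal sketch of this separation, written in plain Python rather than the real Alexa or Google SDKs: the intent names and sample utterances (the part reviewed at certification) stay fixed, while the handler fetches its response text at request time from a source the developer can update without resubmitting the app. All names here (the invocation name, GetTipIntent, fetch_response_text) are hypothetical and purely illustrative.

```python
# Sketch: certified interaction model vs. mutable back-end responses.
# Hypothetical names only; a real skill would use the Alexa or Google SDKs.

# Fixed at certification: the invocation name, intents and sample utterances.
INTERACTION_MODEL = {
    "invocationName": "daily tips",
    "intents": {
        "GetTipIntent": ["give me a tip", "what's today's tip"],
    },
}

# Changeable at any time: the text the skill actually speaks. In a real skill
# this would typically live in a database or CMS the developer edits freely.
RESPONSES = {
    "GetTipIntent": "Drink a glass of water before your first coffee.",
}

def fetch_response_text(intent_name: str) -> str:
    """Stand-in for a back-end lookup the developer controls outside certification."""
    return RESPONSES.get(intent_name, "Sorry, I don't have anything for that yet.")

def handle_request(utterance: str) -> str:
    """Match an utterance against the certified intents, then answer with mutable content."""
    for intent, samples in INTERACTION_MODEL["intents"].items():
        if utterance.lower() in samples:
            return fetch_response_text(intent)
    return "Sorry, I didn't understand that."

if __name__ == "__main__":
    print(handle_request("Give me a tip"))  # prints whatever the back-end currently returns
```

Because only the certified interaction model is reviewed, edits to the response source on the back-end remain invisible to the certification process.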

Privacy, Security Concerns and Liability: At all levels, developers avoided writing their own privacy policies, opting instead to use ones found online or deferring to their clients’ existing policies. This backs up our previous research finding that 1,500 Alexa skills have a privacy policy used by another skill, with most having partial or broken traceability (permissions are not traceable to the data practices defined in the privacy document).

We have a situation where vendors like Amazon do not carry out rigorous checks of privacy policies, and data protection authorities struggle to enforce complaints against even the bigger companies. The result is an opportunity for those who might seek to take advantage of the lack of regulatory attention.

None of the participants interviewed were worried about facing consequences for inadequate policies. They often thought that the data they collected was not personal enough to count as personal data. Because the information revealed to developers has already been collected by Amazon, and is only passed on where permissions allow, many felt that liability did not lie with the developer. They also pointed out that their voice app was hosted on a platform-provided service, so security was the platform’s responsibility.

The privacy, security and certification mechanisms used are almost completely invisible to users. Developers felt they had been left to their own devices when making privacy and security decisions, with little guidance from platforms on how those decisions should be made. Many lacked the expertise and confidence to tackle these issues themselves. This led developers to rely on others seen as more knowledgeable, e.g. by copying privacy policies and assuming issues would be handled by their hosting provider.

Implications and Future Directions

The study underscores the power platforms wield over developers and gives a clearer picture of an ecosystem that encourages negative security and privacy outcomes and poor financial viability for developers. The lack of robust support and clear guidelines from platforms exacerbates these challenges around privacy, security, and monetisation.

The findings call for better support from platform providers and more consistent certification processes. They demonstrate the need for tighter regulation to ensure platforms do not dodge responsibility for privacy and security. As voice-activated AI systems continue to grow, addressing these challenges will be crucial for fostering a healthy and innovative developer community. This will become even more important as voice-enabled AI systems grow increasingly sophisticated, built on Large Language Models (LLMs) and assistant technologies like ChatGPT.

To read the full paper, please visit the following webpage: https://kclpure.kcl.ac.uk/portal/en/publications/voice-application-developer-experiences-with-alexa-and-google-ass.


Connect with us on LinkedIn and Twitter

Our podcast Always Listening

If you would like to share any comments, or just to get in contact with the SAIS team, email: sais-comms@kcl.ac.uk

SAIS is a cross-disciplinary collaboration between the departments of Informatics, Digital Humanities and The Policy Institute at King’s College London, and the Department of Computing at Imperial College London, working with non-academic partners: Microsoft, Humley, Hospify, Mycroft, policy and regulation experts, and the general public, including non-technical users.
