SPSC Webinar & Café Sessions

Smart speakers and virtual assistants are part of our daily lives. They are used not only in our homes, where speech technology grants an unprecedented level of convenience, but also in health care, forensic science, and banking and payment services; speech technology has dual-use applications. Consequently, we need to evolve our understanding of security & privacy for applications in speech communication.

SPSC features two formats.

  • Webinar — Once-a-month web seminars. The lectures range from keynote-style talks by senior figures in industry and academia to practice talks for doctoral and master's defenses. We aim for the first Monday of each month (unless it is a bank holiday) at 10h Brussels time.
    Duration. 40-minute talk & up to 20 minutes of Q&A.
     
  • Café — On-demand brainstorming. We meet for a café session: tea, coffee or another beverage to exchange and discuss our thoughts and ideas. Senior experts pitch a topic for an interactive debate. A café session can be quick (about 30 minutes) but may sometimes take up to one hour (not longer). The day of the event is set by the presenter; we aim for either 10h or 16h Brussels time.
    Duration. 20-minute talk & up to 40 minutes of discussion.

Outcome. The goal of the lecture talks is to understand other perspectives and to discuss particular aspects of SPSC in its interdisciplinary setting. We need to leave our comfort zones to meaningfully anticipate the merger of speech technology with SPSC research areas, including user-interface design, law, cryptography, and the cognitive sciences.

Propose a talk. Simply drop us an email with speaker, date/time, title & abstract to: cafe@lists.spsc-sig.org

Open to everyone. Including non-members (0 EUR fee). Please register to give your data privacy consent and to obtain the session URL.

Upcoming events

Webinar: 2021-03-01 (Mon) Yefim Shulman, Tel Aviv University — 10h Brussels time [registration]
    Promised but Not Guaranteed: Understanding People's Ability to Control Their Personal Information
    Abstract [+]
The consensus in legal frameworks, such as the GDPR and CCPA, states that people (data subjects) should be able to exercise control over their personal information. Yet, having control over what happens to their personal information in practice remains a challenging endeavor for the data subjects. Based on a conceptual control theoretic analysis and select empirical findings, my talk will discuss what control over personal information may require and how it may be improved.
Webinar: 2021-04-12 (Mon) Olya Kudina, TU Delft — 10h Brussels time
    Ethical considerations of the algorithmic processing of language and speech
    Abstract [+]
In this talk, Olya will discuss the ethical implications of voice-based interfaces, such as Siri, Google Home or Alexa. She will consider them from the theoretical perspective of technologies-as-mediators and ethics-as-accompaniment to show how voice-based technologies help to shape our lives. Olya will discuss specific instances of how they foster moral perceptions, choices and values, and in parallel give a response from the creative user and design communities. Together, this will provide a clue on how to shape meaningful interactions with voice assistants, individually and collectively.
Webinar: 2021-05-03 (Mon) Clara Hollomey, Austrian Academy of Sciences — 10h Brussels time
    TBA
    Abstract [+]
TBA
Webinar: 2021-06-07 (Mon) Ingo Siegert, Otto-von-Guericke-University Magdeburg — 10h Brussels time
    Speech Behavior Matters - Automatically Detect Device Directed Speech for the application of addressee-detection
    Abstract [+]
Voice assistants are becoming more and more popular and are changing the way people interact. More and more people come into contact with them, and using them in daily routines has become common. Unfortunately, most systems are still little more than voice-controlled remote controls, and the conversations still feel uncomfortable, especially as conversation activation requires a wake-word, which remains error-prone. This talk first discusses examples of errors in conversation initiation and outlines the state of the art in the research field of addressee detection, with a special focus on prosodic differences in addressee behavior. Afterwards, the speaker's own analyses of addressee behavior towards modern voice assistants are presented for two different settings: a) interactions with Amazon's Alexa in a lab setting, using a dataset of similar dialog complexity between human-human and human-computer interaction. Subsequently, analyses of self-reports and annotator feedback on the speaking behavior will be discussed, followed by an overview of different recognition experiments towards building an (intelligent) addressee-detection framework based on prosodic characteristics. The talk concludes by mentioning possible future research directions and open issues in experimental conditions.
Webinar: 2021-08-02 (Mon) Andreas Nautsch, Eurecom, Sophia Antipolis — 10h Brussels time
    Metrics in VoicePrivacy and ASVspoof Challenges
    Abstract [+]
TBA


Past events

Webinar: 2021-02-01 (Mon) Lara Gauder & Leonardo Pepino, University of Buenos Aires — 16h Brussels time [slides, video (talk only)]
    A Study on the Manifestation of Trust in Speech
    Abstract [+]
Research has shown that trust is an essential aspect of human-computer interaction, determining the degree to which a person is willing to use a system. Predicting the level of trust that a user has in the skills of a certain system could be used to attempt to correct potential distrust by having the system take relevant measures, such as explaining its actions more thoroughly. In our research project, we have explored the feasibility of automatically detecting the level of trust that a user has in a virtual assistant (VA) based on their speech. For this purpose, we designed a protocol for collecting speech data, consisting of an interactive session where the subject is asked to respond to a series of factual questions with the help of a virtual assistant, which they were led to believe was either very reliable or unreliable. We collected a speech corpus in Argentine Spanish and found that the reported level of trust was effectively elicited by the protocol. Preliminary results using random forest classifiers showed that the subject’s speech can be used to detect which type of VA they were using with an accuracy of up to 76%, compared to a random baseline of 50%.
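The classification framing above can be illustrated with a toy sketch. This is not the study's pipeline: the study used random forest classifiers on real speech data, whereas everything below (the "prosodic" features, their distributions, and the simple nearest-centroid model) is invented for illustration; only the two-condition setup with a 50% chance baseline is taken from the abstract.

```python
# Toy sketch: detect which VA condition (0 = reliable, 1 = unreliable)
# a "subject" used, from two made-up per-utterance features.
# All numbers are synthetic; a real system would use random forests
# on genuine prosodic/spectral features.
import random

random.seed(0)

def make_sample(label):
    # Hypothetical effect: feature means shift slightly with distrust.
    base = (1.0, 1.0) if label == 0 else (1.4, 0.7)
    return [random.gauss(base[0], 0.3), random.gauss(base[1], 0.3)], label

data = [make_sample(i % 2) for i in range(200)]
train, test = data[:150], data[150:]

def centroid(samples):
    # Mean feature vector of a class.
    xs = [f for f, _ in samples]
    return [sum(col) / len(col) for col in zip(*xs)]

c0 = centroid([s for s in train if s[1] == 0])
c1 = centroid([s for s in train if s[1] == 1])

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Nearest-centroid decision; count correct predictions on held-out data.
correct = sum((dist2(f, c0) > dist2(f, c1)) == bool(y) for f, y in test)
print(correct / len(test))  # well above the 0.5 chance baseline
```

Because the synthetic class means are separated relative to the noise, even this trivial classifier beats the 50% baseline, which is the same kind of comparison the abstract reports for its random forest.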
Webinar: 2021-01-11 (Mon) Tom Bäckström, Aalto University — 10h Brussels time [slides, video (talk only)]
    Code of Conduct for Data Management in Speech Research - Starting the process
    Abstract [+]
For anyone working with speech it should, by now, be obvious that we need to take care of the rights of the people involved. The question is only: how do we do that? Data management in an ethically responsible manner is one aspect of this issue, and it touches all researchers; I have myself struggled many times with designing data management plans, and I have received many requests that the SPSC SIG should take a shot at this. Therefore I think we need community-wide guidelines on how to handle data management, especially with respect to privacy. For example, what is an acceptable level of anonymization of data? When do we need to anonymize? When do we need to limit access to data by, say, requiring the signature of a contract? What kind of expiry dates should data have? To which extent should data use be checked in the review processes of conferences, journals and grant applications? And so on. The objective of this session is to set up the process through which we create a code of conduct. That is, my intention is only to discuss how we want to make decisions about the code of conduct, not even to attempt writing a first draft. In this session I'll thus present an outline of a roadmap for how we can create a code of conduct. The session will consist of a short pre-recorded presentation, where I present my initial draft of the process. After the presentation, I invite everyone to join a discussion with webcams on. The session is not recorded, but I'll share my notes with the participants.
Webinar: 2020-12-07 (Mon) Birgit Brüggemeier, Fraunhofer Institute for Integrated Circuits IIS — 10h Brussels time [slides, video (talk only)]
    Conversational Privacy – Communicating Privacy and Security in Conversational User Interfaces
    Abstract [+]
In 2019, media scandals raised awareness about privacy and security violations in Conversational User Interfaces (CUI) like Alexa, Siri and Google. Users report that they perceive CUI as “creepy” and that they are concerned about their privacy. The General Data Protection Regulation (GDPR) gives users the right to control the processing of their data, for example by opting out or requesting deletion, and it gives them the right to obtain information about their data. Furthermore, the GDPR advises seamless communication of user rights, which is currently poorly implemented in CUI. This talk presents a data collection interface, called Chatbot Language (CBL), that we use to investigate how privacy and security can be communicated in a dialogue between user and machine. We find that conversational privacy can affect user perceptions of privacy and security positively. Moreover, user choices suggest that users are interested in obtaining information on their privacy and security in dialogue form. We discuss implications and limitations of this research.
Webinar: 2020-11-02 (Mon) Rainer Martin & Alexandru Nelus, Ruhr-Universität Bochum — 10h Brussels time [slides, video (talk only)]
    Privacy-preserving Feature Extraction and Classification in Acoustic Sensor Networks
    Abstract [+]
In this talk we present a brief introduction to acoustic sensor networks and to feature extraction schemes that aim to improve the privacy vs. utility trade-off for audio classification in acoustic sensor networks. Our privacy enhancement approach consists of neural-network-based feature extraction models which aim to minimize undesired extraneous information in the feature set. To this end, we present adversarial, Siamese and variational information feature extraction schemes in conjunction with neural-network-based classification (trust) and attacker (threat) models. We consider and compare schemes with explicit knowledge of the threat model and without such knowledge. For the latter, we analyze and apply the variational information approach in a smart-home scenario. It is demonstrated that the proposed privacy-preserving feature representation generalizes well to variations in dataset size and scenario complexity while successfully countering speaker identification attacks.
Webinar: 2020-10-05 (Mon) Nick Gaubitch, Pindrop — 10h Brussels time [video (talk only)]
    Voice Security and Why We Should Care
    Abstract [+]
After a couple of decades of somewhat slow development, voice technologies have once again gained momentum. Much of this has been driven by large leaps in speech and speaker recognition performance and, consequently, the development of many voice interfaces. Some notably successful examples of current applications of voice are the Amazon Echo and Apple Siri, but we also see an increasing number of institutions that make use of voice recognition to replace more traditional customer identification methods. While much of this development is exciting for speech and audio processing research, it also creates new and significant challenges in security and privacy. Furthermore, new technologies for various forms of voice modification and synthesis are on the rise, which only exacerbates the problem.

In this talk we will first introduce Pindrop and the company’s mission in the world of voice security and we will take a glimpse into the global fraud landscape of call centres, which motivates the work that we do. Next, we will take a deeper dive into the specific topic of voice modification and some related research results. Finally, we will provide an outlook into the future of voice and voice security.
Webinar: 2020-09-07 (Mon) Pablo Pérez Zarazaga, Aalto University — 10h Brussels time [slides, video (talk only)]
    Acoustic Fingerprints for Access Management in Ad-Hoc Sensor Networks
    Abstract [+]
Voice user interfaces can offer intuitive interaction with our devices, but the usability and audio quality could be further improved if multiple devices could collaborate to provide a distributed voice user interface. To ensure that users' voices are not shared with unauthorized devices, it is however necessary to design an access management system that adapts to the users' needs. Prior work has demonstrated that a combination of audio fingerprinting and fuzzy cryptography yields a robust pairing of devices without sharing the information that they record. However, the robustness of these systems is partly based on the extensive duration of the recordings required to obtain the fingerprint. This paper analyzes methods for robustly generating acoustic fingerprints over short periods of time, enabling responsive pairing of devices according to changes in the acoustic scene, which can also be integrated into other typical speech processing tools.
Café: 2020-08-27 (Thu) Catherine Jasserand, Rijksuniversiteit Groningen — 16h Brussels time [slides]
    What is speech/voice from a data privacy perspective: Insights from the GDPR
    Abstract [+]
Catherine Jasserand, a postdoctoral researcher on privacy issues raised by biometric technologies, will discuss the notions of speech and voice from a data privacy perspective. Although the GDPR mentions neither voice data nor speech data among its examples of personal data, it applies to both types of data when they relate to an identified or identifiable individual. The talk will be an opportunity to explain terminological issues (including what ‘identification’ means in the context of data protection).
Webinar: 2020-08-03 (Mon) Qiongxiu Li, Aalborg Universitet — 10h Brussels time [slides, video (talk only)]
    Privacy-Preserving Distributed Optimization via Subspace Perturbation: A General Framework
    Abstract [+]
As the modern world becomes increasingly digitized and interconnected, distributed signal processing has proven to be effective in processing its large volume of data. However, a main challenge limiting the broad use of distributed signal processing techniques is the issue of privacy in handling sensitive data. To address this privacy issue, we propose a novel yet general subspace perturbation method for privacy-preserving distributed optimization, which allows each node to obtain the desired solution while protecting its private data. In particular, we show that the dual variables introduced in each distributed optimizer will not converge in a certain subspace determined by the graph topology. Additionally, the optimization variable is ensured to converge to the desired solution, because it is orthogonal to this non-convergent subspace. We therefore propose to insert noise in the non-convergent subspace through the dual variable such that the private data are protected, and the accuracy of the desired solution is completely unaffected. Moreover, the proposed method is shown to be secure under two widely-used adversary models: passive and eavesdropping. Furthermore, we consider several distributed optimizers such as ADMM and PDMM to demonstrate the general applicability of the proposed method. Finally, we test the performance through a set of applications. Numerical tests indicate that the proposed method is superior to existing methods in terms of several parameters like estimated accuracy, privacy level, communication cost and convergence rate. [pre-print]
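The subspace idea in this abstract can be illustrated with a small self-contained sketch (not the paper's protocol; the direction, vectors and noise scale below are all invented): noise confined to the orthogonal complement of a "solution" direction masks a variable heavily while leaving its useful component untouched.

```python
# Toy sketch of subspace perturbation (illustrative only): noise is
# restricted to the orthogonal complement of a "solution" direction u,
# so the component of the private variable x along u is unaffected.
import random

random.seed(1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u = [1.0, 0.0, 0.0]   # unit stand-in for the convergent (solution) direction
x = [3.0, 2.0, 1.0]   # private variable to be masked

noise = [random.gauss(0.0, 1.0) for _ in range(3)]
c = dot(noise, u)                                   # component along u
noise = [n - c * ui for n, ui in zip(noise, u)]     # project it out

# Heavy perturbation, but only off the solution subspace.
x_masked = [xi + 10.0 * ni for xi, ni in zip(x, noise)]

print(dot(x_masked, u))   # ≈ 3.0: the solution component survives masking
```

In the actual method, the non-convergent subspace is determined by the graph topology and the noise enters through the dual variables of the distributed optimizer (e.g. ADMM or PDMM); this sketch only shows why perturbation restricted to an orthogonal subspace leaves the desired component intact.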
Webinar: 2020-07-06 (Mon) Francisco Teixeira, INESC-ID / IST, Univ. of Lisbon — 10h Brussels time [slides, video (talk only)]
    Privacy in Health Oriented Paralinguistic and Extralinguistic Tasks
    Abstract [+]
The widespread use of cloud computing applications has created a society-wide debate on how user privacy is handled by online service providers. Regulations such as the European Union's General Data Protection Regulation (GDPR) have put forward restrictions on how such services are allowed to handle user data. The field of privacy-preserving machine learning is a response to this issue that aims to develop secure classifiers for remote prediction, where both the client's data and the server's model are kept private. This is particularly relevant in the case of speech, and concerns not only the linguistic contents, but also the paralinguistic and extralinguistic information that may be extracted from the speech signal.
In this talk we provide a brief overview of the current state of the art in paralinguistic and extralinguistic tasks for a major application area in terms of privacy concerns - health - along with an introduction to cryptographic methods commonly used in privacy-preserving machine learning. These lay the groundwork for a review of the state of the art of privacy in paralinguistic and extralinguistic tasks for health applications. With this talk we hope to raise awareness of the problem of preserving privacy in these tasks and to provide an initial background for those who aim to contribute to this topic.