Attacking Deep Learning-Based NLP Systems with Malicious Word Embeddings

Presented at BSidesSF 2019, March 3, 2019, 2:10 p.m. (30 minutes).

Recent Deep Learning-based Natural Language Processing (NLP) systems rely heavily on Word Embeddings, a.k.a. Word Vectors, a method of converting words into meaningful vectors of numbers. However, the process of gathering data, training word embeddings, and incorporating them into an NLP system has received little scrutiny from a security perspective. In this talk we demonstrate how such systems can be influenced by manipulating the training data, and how malicious embeddings can be injected into real-world systems.
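
To make the training-data angle concrete, below is a minimal sketch (not the presenter's actual method) of how injected sentences can shift a word's learned neighbors in an embedding model. It uses gensim's Word2Vec; the clean corpus, the "poison" sentences, and the target word "attached" are all invented for illustration.

```python
# Illustration only: poisoning word-embedding training data.
# Requires gensim (4.x); corpus and poison sentences are made up.
from gensim.models import Word2Vec

clean_corpus = [
    ["the", "invoice", "is", "attached", "please", "review"],
    ["meeting", "scheduled", "for", "monday", "morning"],
    ["please", "find", "the", "report", "attached"],
] * 200

# Attacker-injected sentences that repeatedly pair the benign word
# "attached" with spam-like context, nudging its vector toward spam terms.
poison = [
    ["attached", "free", "prize", "winner", "click"],
    ["click", "attached", "free", "offer", "now"],
] * 200

clean_model = Word2Vec(clean_corpus, vector_size=50, min_count=1, seed=1, epochs=20)
poisoned_model = Word2Vec(clean_corpus + poison, vector_size=50, min_count=1, seed=1, epochs=20)

# After poisoning, "attached" drifts toward the attacker-chosen context words,
# which can change how a downstream classifier treats sentences containing it.
print("clean neighbors:   ", clean_model.wv.most_similar("attached", topn=3))
print("poisoned neighbors:", poisoned_model.wv.most_similar("attached", topn=3))
```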


Presenters:

  • Toshiro Nishimura
    Toshiro is an independent software engineer and entrepreneur with a passion for security and privacy. He has previously worked in email security and bioinformatics. His regular expression and Vim skills are second to none.
