Virtually any meaningful interaction across the Internet requires establishing a user profile, which in turn requires entering Personally Identifiable Information (PII) so that service providers can verify, support, and track user activity. Such PII often includes a person's name, age, address, email, phone number, or demographic information, and it is often associated with the IP address of the device used to access online services. All of this contributes to tailored responses from the vendor. Most users understand and accept that these distant parties will use the information to optimize their interactions; however, substantially unrelated uses and abuses of users' personal information are common.
Our talk explores the extent to which online entities and their affiliates use and abuse our personal information. Our conclusions are based on a 12-month study tracking email, phone, SMS, and web scraping activity for 300 false identities established at ~200 distinct organizations, with the goal of determining which companies behave consistently with a consumer's interests and which are to blame for our culture of robocalls and spam. All of this activity stems from one-time interactions with the online entity, which resulted in 16,584 emails, 948 voicemails, and 753 text messages.
Beyond quantifying the amount of activity associated with these identities and building the graph of information sharing, we also analyze the received content in the context of a quantitative rubric applied to published privacy policies and of political and/or special-interest leanings, and we attempt to identify tangible evidence of foreign interest in the 2020 presidential election. We plan to make this dataset available for others to investigate as well.
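
As a rough illustration of how a graph of information sharing could be assembled from this kind of data, the sketch below assumes each false identity's contact address was given to exactly one organization, so any message from a different sender suggests the address was passed along. The names `REGISTERED_WITH` and `build_sharing_graph`, the example addresses, and the (recipient, sender) message format are all hypothetical; this is not the study's actual tooling, and a differing sender may simply be an affiliate or alternate domain of the same organization.

```python
from collections import Counter

# Hypothetical mapping: each false identity's contact address -> the single
# organization it was registered with (illustrative values only).
REGISTERED_WITH = {
    "id001@example.test": "shop.example",
    "id002@example.test": "news.example",
}


def build_sharing_graph(messages, registered_with):
    """Count (origin_org, sender) edges for messages from unexpected senders.

    `messages` is an iterable of (recipient_address, sender) pairs. An edge
    only suggests sharing; it does not prove it.
    """
    edges = Counter()
    for recipient, sender in messages:
        origin = registered_with.get(recipient)
        if origin is None:
            continue  # recipient is not one of the tracked identities
        if sender != origin:
            edges[(origin, sender)] += 1
    return edges


# Illustrative messages, not data from the study.
messages = [
    ("id001@example.test", "shop.example"),   # expected sender, no edge
    ("id001@example.test", "ads.example"),
    ("id001@example.test", "ads.example"),
    ("id002@example.test", "polls.example"),
    ("unknown@example.test", "ads.example"),  # untracked recipient, skipped
]
graph = build_sharing_graph(messages, REGISTERED_WITH)
```

Weighting edges by message count keeps the volume of activity and the sharing relationships in one structure, so heavily trafficked paths stand out without a separate tally.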