- This event has passed.
Anonymity of multi-hop neighborhoods in social networks
13 September 2022 , 10:46 – 11:06 UTC+1
Lecture by Rachel de Jong, Frank Takes and Mark van der Loo at the 6th European Conference on Social Networks (EUSN 2022).
Introduction & Goal. Sharing large-scale social network datasets is advantageous for the development of computational social science, since studying and replicating findings on such datasets is key to understanding and modeling various social phenomena [1, 2]. Following the principles of widely implemented privacy laws such as GDPR, such datasets need to be anonymous, which means that people should not be identifiable by someone with a realistic amount of background knowledge. This work focuses on a method to assess this so-called risk of disclosure, by measuring the anonymity of individuals in networks based on their structural position within the network.
Previous work has focussed on measuring anonymity using only the direct surroundings of a node [3]. However, in [4] it is shown that when a possible attacker has information about a larger neighborhood beyond these direct surroundings, this could drastically decrease the anonymity of the individual. Therefore, in this work, we present a novel approach that extends these two earlier works into a parametrized measure that can serve as a lower bound for the expected anonymity at different levels of knowledge of the attacker. On both modeled and real-world social network data, we demonstrate that if an attacker has perfect information about what we call multi-hop neighborhoods, the anonymity of individuals in the social network is severely compromised. This has serious implications for any social science researcher sharing social network data with other parties.
Approach. We measure the anonymity by partitioning the set of nodes of a given social network into equivalence classes. We define equivalence by using the measure of d–k-anonymity, where two nodes are d-equivalent if 1) their respective d-hop neighborhoods (i.e., neighborhoods up to distance d of the node) are isomorphic, and 2) there is an isomorphism mapping the two compared nodes onto each other. Next, following [3], we define a node as unique if it has no equivalent nodes in the network.
To understand anonymity of individuals in real-world networks, we measure structural anonymity in various known graph models (Erdős–Rényi (ER) and Watts Strogatz (WS)) and a range of empirical network datasets. We investigate anonymity for increasingly larger hop neighborhoods, and therewith different attacker knowledge scenarios. This improves upon [3] because we allow for larger-hop neighborhoods, and upon [4] because we assume perfect information about connectivity of individuals up to a certain distance.