CI GApp

Email and other communication platforms are deeply embedded in the fabric of our daily life. In the age of information overload it is becoming increasing hard to keep track of all the information flows while making sure that contextual integrity of the transmitted information is maintained. Notably, the Hilary Clinton email scandal has shown that often senior figures in the government don’t really understand the technology and its implication which can lead to information leakage. But you don’t need to a high profile case to appreciate this point, we all had “Reply-All” incidents and forwarding need to know bits from previous conversations.

To tackle the problem we leverage the theory of Contextual Integrity (CI) as a framework for capturing a relevant contextual flow. In terms of CI, information leaks are a result of altered flows which don’t follow the established information-contextual norms. CI offers a simple template comprising elements such as sender, recipients, information attributes, subject and transmission principles to construct and subsequently reason about different informational flows. It is worth noting that the theory of contextual integrity does not offer an explicit solution to mitigate the problem, rather it serves to advise where the source of the controversy lies by identifying information flows deviating from the established norms. More so, the evolution of norms, as argued by the CI theory, requires a comprehensives analysis that includes looking at deviating from the established/entrenched norms information flows in terms of their aims, purposes and values.

The CI decision heuristic recommends assessing [alternative flows’] respective merits as a function of the of their meaning and significance in relation to the aims, purposes, and vbalues of the context…

– Privacy in Context: Technology, Policy, and the Integrity of Social Life By Helen Nissenbaum

Nevertheless, the CI framework, allows for an relatively straightforward way to capture existing and hypothetical information flows by analysing the legal policies and generating a space of possible information flows in a given context with respect to senders, receivers, attributes, subjects and transmission principles.

To showcase the practical implementation of the CI theory, we are developing a browser extension for Gmail. The add-on that will assist users with retaining the integrity of the information flows generated as a result of their emails. Gmail is widely adopted email service, used by millions and source to a variety of contextual-information flows.

CI & Emails

Our initial system prototype will be designed to leverage our CI based privacy framework to minimize the chances of information leakage incidents. The system will warn and in some cases prevent users from accidentally generating information flows that might violate any of the expected privacy norms.

Screenshot from the current prototype - the extension was able to identifying contexts (Personal, Educational) based on emails of the recipients and the text of the email, roles of recipients in a particular context, etc.

The User Interface (UI) in traditional email clients lends itself well onto the CI framework, namely, Senders and Recipients are explicitly specified and form distinct and explicit information flows among themselves. The rest of the CI elements are inferred by the system. As depicted in a screenshot in Figure 1, our system takes full control over the compose window. To leverage the CI framework to foresee potential privacy violations, it will extract relevant CI elements from the emails.

In the backend, our system will use a combination of NLP and ML techniques to determine what type of information is disseminated (attributes), across what contexts, under which conditions (transition principles) and to whom (recipients). The resulting flow is checked against the available privacy logic, which is derived from existing privacy policies and/or other regulations.

In addition, our extension also monitors the information flows across all emails in the thread. In other words, it captures all contexts, senders, recipients, attributes, etc in all the messages of the thread. This allows the backend of our system to perform complex tests using formal verification techniques. For example, if John writes an email to Ben, where he discusses some personal information about Jen, who, according to the privacy policy has to give her consent to it. The extension will check whether an email from Jen exists in the conversation permitting the exchange of that particular information. The extension will notify user of any detected violations before the email is sent.