By Wilson Chua
BASS values the privacy of our volunteers. We minimize re-identification risk by not collecting known sensitive information. This includes email addresses, phone numbers, IMSI and social network accounts. Imagine our surprise (and our thanks) when Prof Rommel Feria and 18 students of UP CS173 (Trends in Mobile App Development) raised the issue of the IMEI. This was part of their exhaustive code review, which was suggested by Grace Mirandilla Santos (our thanks to her as well).
Professor Rommel Feria considers the IMEI “just too private”. However, he hastens to add that “Ookla collects the IMEI as well”. IMEI stands for “International Mobile Equipment Identity”. Hackers need to have both the IMEI AND the IMSI to be able to clone phones.
Why store the IMEI?
We use it to verify the authenticity of user submissions. If a user entry from an Apple iPhone 7 comes with an IMEI code for a Samsung Galaxy, we tag the record as “suspect”. We also use the IMEI to analyze the number of unique handsets by manufacturer and brand. This enables us to generate interesting insights.
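The check described above can be sketched roughly as follows. This is only an illustration, not the actual BASS code: the function name `tag_submission`, the `TAC_BRANDS` table and the TAC values in it are made-up placeholders, and a real lookup would need an actual TAC reference list.

```python
# Hypothetical TAC-to-brand lookup table. The TAC values below are
# placeholders for illustration, not real allocations.
TAC_BRANDS = {
    "11111111": "Apple",
    "22222222": "Samsung",
}


def tag_submission(reported_brand: str, imei: str) -> str:
    """Compare the brand reported by the app with the brand implied by the IMEI.

    Returns "ok" when they agree, "suspect" when they disagree, and
    "unknown" when the TAC is not in the lookup table.
    """
    tac = imei[:8]  # the first 8 digits identify the device type
    expected_brand = TAC_BRANDS.get(tac)
    if expected_brand is None:
        return "unknown"
    if expected_brand.lower() == reported_brand.lower():
        return "ok"
    return "suspect"
```

With this table, a record that claims to be an Apple handset but carries a TAC mapped to Samsung would come back as "suspect".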
Here is a sample of what we can do with IMEI data. The interactive chart shows the number of volunteers using [Samsung] brand cellphones. The chart also displays the number of unique handsets per brand. You can view the number of tests and the average bandwidth as well. We can also track the 95th percentile of the signal strength for each brand and model of handset. At this level of aggregation, the IMEI poses no re-identification risk at all.
But what if hackers breach the origin dataset? Would the IMEI information prove to be valuable to hackers? We believe so. To be on the safe side, we consulted Dondi Mapa, National Privacy Commission Deputy Commissioner.
Could he help us balance the need for privacy against the need for accuracy? After several minutes of discussion, Dondi came up with a viable alternative.
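For readers who want to try a similar roll-up, here is a minimal sketch of the per-brand aggregation. The record fields (`brand`, `imei`, `mbps`) and the function name `summarize_by_brand` are assumptions made for this example; they are not the actual BASS schema.

```python
from collections import defaultdict


def summarize_by_brand(records):
    """Roll test records up to brand level.

    Each record is assumed to be a dict with the hypothetical keys
    "brand", "imei" and "mbps". Only counts and averages are returned,
    so no individual handset identifier appears in the output.
    """
    handsets = defaultdict(set)   # brand -> distinct handset identifiers
    speeds = defaultdict(list)    # brand -> bandwidth of every test

    for record in records:
        handsets[record["brand"]].add(record["imei"])
        speeds[record["brand"]].append(record["mbps"])

    return {
        brand: {
            "unique_handsets": len(handsets[brand]),
            "tests": len(speeds[brand]),
            "avg_mbps": sum(speeds[brand]) / len(speeds[brand]),
        }
        for brand in speeds
    }
```

The output holds one summary row per brand, which is the level of aggregation the chart works at.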
We don’t need the full 15 digits of the IMEI. The IMEI is made up of 3 parts. The first 8 digits are the TAC (Type Allocation Code), which is assigned to phone manufacturers and identifies the make and model.
The next 6 digits are the serial number that identifies the particular phone. They are followed by 1 digit that serves as a checksum. We drop the second part and keep the first and the last parts.
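The truncation described above can be sketched in a few lines. This is an illustration under the 8 + 6 + 1 layout described here, not the production code; the function names `split_imei` and `truncate_imei` are made up for the example, and the IMEI used below is a commonly cited sample number, not a volunteer's.

```python
def split_imei(imei: str):
    """Split a 15-digit IMEI into (TAC, serial number, check digit)."""
    if len(imei) != 15 or not (imei.isascii() and imei.isdigit()):
        raise ValueError("expected a 15-digit IMEI")
    tac = imei[:8]        # Type Allocation Code: make and model
    serial = imei[8:14]   # identifies the individual handset
    check = imei[14]      # single check digit
    return tac, serial, check


def truncate_imei(imei: str) -> str:
    """Drop the 6-digit serial number; keep the TAC and the check digit."""
    tac, _serial, check = split_imei(imei)
    return tac + check


print(truncate_imei("490154203237518"))  # prints 490154208
```

The stored value is 9 digits long instead of 15, and the part that singles out one particular handset is never written to the dataset.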
So even if the data were breached (God forbid), the truncated IMEI that we store is useless for cloning. This arrangement still allows us to use the truncated IMEI for data verification. It also allows us to do an intermediate level of analysis. We just lose a little bit of detail. But considering the pros and cons, this is a good tradeoff. As added security, even the truncated IMEI is excluded from the publicly downloadable set. We hope this case can be a valuable learning experience for other data custodians.
Plug for Digital Entrepreneurs:
I will be at GOAB in Palawan next week. I hope to see budding entrepreneurs there and chat with you.