I carried out a detailed privacy check of the TikTok app and its website. I found multiple violations of the law as well as breaches of trust, transparency and data protection.
I provide all technical and legal details in this article. For a less technical view, read the article at Süddeutsche Zeitung (in German).
I used mitmproxy to reroute all app traffic for analysis. The video shows how device information, usage time and the list of watched videos are sent to Appsflyer and Facebook.
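A capture of this kind can be approximated with a small mitmproxy addon. The following is a minimal sketch under stated assumptions, not necessarily the setup used for this analysis: the host suffixes in `WATCHED_SUFFIXES` and the names `is_watched` and `ThirdPartyLogger` are illustrative choices, and the real endpoints may differ.

```python
"""Sketch of a mitmproxy addon that logs requests to selected third-party hosts.

Load it with: mitmdump -s this_file.py
The host suffixes below are assumptions for illustration only.
"""

# Assumed host suffixes for the two companies named in the text.
WATCHED_SUFFIXES = ("appsflyer.com", "facebook.com")


def is_watched(host: str) -> bool:
    """Return True if host equals a watched suffix or is a subdomain of it."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in WATCHED_SUFFIXES)


class ThirdPartyLogger:
    """Collects (host, path, body size) for every request to a watched host."""

    def __init__(self):
        self.hits = []

    def request(self, flow):
        # mitmproxy calls this hook once per intercepted HTTP request.
        req = flow.request
        if is_watched(req.pretty_host):
            body = req.content or b""
            self.hits.append((req.pretty_host, req.path, len(body)))
            print(f"{req.method} {req.pretty_host}{req.path} ({len(body)} bytes)")


# mitmproxy picks up addons from this module-level list.
addons = [ThirdPartyLogger()]
```

The device under test has to use the proxy and trust the mitmproxy CA certificate; apps that pin certificates need additional work that this sketch does not cover.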
It is hard to believe that this is covered by “legitimate interest” and meets transparency requirements: the search terms that I entered are forwarded to Facebook.
The transfers to these two companies clearly conflict with the GDPR:
Facebook cannot comply with article 14 regarding the rights to deletion of information etc. for this data.
The data transfer to Appsflyer also lacks transparency as it is unknown to which of its more than 4500 partners the data might get transferred down the line. Bytedance’s answer to this: „We won’t show you the contracts.“ Did they even read article 26 of the GDPR?
Most importantly, fundamental rights are being violated, since Personally Identifiable Information (PII) is transferred to a server under the control of a company residing in an insecure, non-European country. The location of the server is irrelevant; what matters is the location of the company that decides about the data, according to Malte Engeler. Bytedance’s headquarters are located in Beijing, China.
I also checked the website itself, which is important since all videos that are shared via messenger or social media are viewed there. Any shortened URL for a video (like vm.tiktok.com/9uTpDV) gets resolved to a URL containing the installation ID. Thereby, TikTok is able to check who shared which video.
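To inspect what a resolved share link carries, the query string of the redirect target can be split into its parameters. This is a minimal sketch: the example URL is invented, and `share_id` is a hypothetical stand-in for whatever parameter actually holds the installation ID.

```python
from urllib.parse import parse_qs, urlsplit


def extract_share_params(resolved_url: str) -> dict:
    """Return the query parameters of the long URL a short link redirects to."""
    query = urlsplit(resolved_url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Invented example; "share_id" is a hypothetical parameter name.
example = "https://example.com/v/123.html?share_id=abc123&lang=en"
params = extract_share_params(example)
print(params.get("share_id"))  # prints "abc123"
```

Any identifier that shows up here travels with every click on the shared link, so it also ends up in the server's request logs.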
They also track who is watching the video. Besides conventional trackers (Google Analytics), the highly controversial method of device fingerprinting is used to assign a unique hash value to a cookie variable named s_v_webid. This is achieved by combining unique hardware and browser characteristics.
One of them: canvas fingerprinting. An image is drawn in the background using vector graphics commands. The image is then rasterized to a PNG image, which in turn gets saved. The resulting data is quite unique among different devices and depends on diverse settings and features of the hardware used.
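The principle of combining device characteristics and the rendered canvas image into a single identifier can be sketched as follows. This is an illustration under assumptions, not TikTok's script: the attribute names are invented, the PNG bytes are placeholders, and SHA-256 is a stand-in chosen only because it ships with Python's standard library. Which hash the site actually uses for this value is not established here.

```python
import hashlib
import json


def fingerprint(attributes: dict, canvas_png: bytes) -> str:
    """Combine browser attributes and rendered canvas bytes into one hex digest."""
    # Sort keys so the same attributes always serialize identically.
    serialized = json.dumps(attributes, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256()
    digest.update(serialized)
    digest.update(b"\x00")  # separator between the two inputs
    digest.update(canvas_png)
    return digest.hexdigest()


# Hypothetical attribute set; a real script would read these from the browser.
attrs = {"user_agent": "ExampleBrowser/1.0", "screen": "1080x1920", "timezone": "UTC+1"}

# Two devices with identical settings but slightly different rendering output.
device_a = fingerprint(attrs, b"placeholder-png-bytes-device-a")
device_b = fingerprint(attrs, b"placeholder-png-bytes-device-b")
print(device_a != device_b)  # prints "True"
```

Because the digest is deterministic, the same device produces the same value on every visit without any stored state, which is what makes the technique work as a tracker even when cookies are cleared.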
Audio fingerprinting is also used to identify visitors. This doesn’t mean that the microphone or speakers of the device are used. Instead, a sound is generated internally and its bitstream is recorded. This also yields different results depending on the device used. This is what it sounds like:
Bytedance states that these fingerprinting techniques are used to identify malicious browser behaviour. That is quite hard to believe, as the website still works as expected even if the corresponding script is blocked. Furthermore, Akamai’s own server-side fingerprinting technology is also used (which is a completely different story waiting to be investigated).
There are several other issues, like Google Analytics being used without anonymizing the IP data. And to top it off, free software is used without proper license attribution: Zepto.js from Thomas Fuchs, Murmur Hash from Austin Appleby and FingerprintJS from Valentin Vasilyev, just to name a few. How low can you go?
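For context on the IP point: anonymization in this sense means truncating the address before it is stored. The sketch below zeroes the last octet of an IPv4 address and the last 80 bits of an IPv6 address, which is how Google Analytics' IP anonymization is commonly described; treat the exact prefix lengths as an assumption of this example. The function name `anonymize_ip` is illustrative.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host part of an IP address so it no longer identifies one device."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 or the first 48 bits of IPv6 (assumed lengths).
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))           # prints "203.0.113.0"
print(anonymize_ip("2001:db8:1234:5678::1"))  # prints "2001:db8:1234::"
```

Without this step, the full address of every visitor is processed together with the page view.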
Those are, however, just the PRIVACY-related problems of TikTok. Just a week ago, Netzpolitik published some detailed information about their CENSORSHIP-related problems. Read up on this in these three related articles, starting with https://netzpolitik.org/2019/discrimination-tiktok-curbed-reach-for-people-with-disabilities/
So is it a good idea for the German news programme Tagesschau to foster TikTok’s ecosystem by publishing its news clips there, which are paid for by Germany’s citizens through an obligatory, nationwide broadcasting fee?
TikTok channel operators may also fall under joint controllership with TikTok, as the ECJ has ruled for Facebook fan pages. As a consequence, a channel on TikTok could be shut down if privacy rights are infringed. Heiko Neuhoff, the DPO of the public broadcaster NDR, told me he is about to decide whether this applies to the Tagesschau channel.
TikTok is breaching the law in several ways whilst exploiting the data of its mainly teenage users. This should be addressed immediately, swiftly and rigorously. The required legislation is in place. Don’t let them get away with breaking society, just as 10 years of Facebook did. Journalists should find a better place to publish their vertical video clips.
(I first published this on Twitter/Mastodon and then later transferred it with minor corrections to this blog post. Special thanks to multimedial.de for corrections regarding my English.)
I’ve had another experience with TikTok. My son (ten years old) has a phone. Not a smartphone, a phone like a Nokia 3310. He has access to a computer with the Windows family program, and I get a weekly report with all websites visited etc.
One day he called me: Dad, I’ve got a text message, I don’t understand it.
It was from TikTok, asking him to create an account. I asked him to delete the message.
But how did they get his phone number?
I made the mistake of asking TikTok to delete the personal data. I should have asked them first to provide me with the data and consents, to see how they collected the phone number.
The DPO answered that it was my son who made a request to connect. That’s impossible.
I think it was probably collected from the contact list of one of his friends…
Your guess is most likely correct. Mobile apps developed by some Chinese companies ask for permission to read the user’s contacts and send out spam without the user’s knowledge.
“I get a weekly report with all websites visited”
What if your son is using privacy tools like Tor or a VPN? A-ha!
The same thing happened to me!
Good and important work on the data analysis of TikTok. But unfortunately the English in this blog post is in serious need of improvement… I would be glad to help if my help is wanted.
“In serious need of improvement” is by no means a fair description of the author’s English. Apart from a few uncritical grammatical errors, it is fine. The core points are all conveyed in a way that is easy to understand for readers with the relevant technical background.
The English is almost perfect and hardly in need of improvement. I speak English fluently and only noticed from the mention of Tagesschau that the author is German.
I agree. Apart from a couple of adjectives instead of adverbs (something native English speakers do as well), the post is very well written. I couldn’t do that well in German (which is why I’m commenting in English, even though I do speak German) 😉
And kudos for graciously accepting help, Mathias!
Hello, yes, unfortunately it is quite possible that it is not particularly good. I would be very happy to receive corrections, either by email or here in a comment. Thanks!
“But unfortunately the English in this blog post is in serious need of improvement…”
I would disagree with this statement. It is very decently written, clear to understand and gets the point across effectively, in my opinion.
Every written article can be criticized and refined to the point it reaches Shakespeare, but that is not the focus of most articles.
There is really no need to change, improve or criticize anything about the English in this blog post.
The offer to help to improve the writing is very kind and considerate however.
That’s why I don’t use TikTok.
I don’t know what you guys smoke. Who gives a shit about this
Another monkey of surveillance capitalism
It’s so cute when cattle think their opinions mean something.
Why are you trolling on an article about TikTok tracking? Get laid.
I do. Unlike you I can actually imagine this being a threat. You don’t give a shit. Average chicken doesn’t give a shit either and soon ends on a plate.
Did you report this to the proper government body in the EU? If not, you should do so immediately. Send it to, or tag on Twitter, anyone connected to the privacy bodies responsible for the GDPR etc. Action needs to be taken.
As a journalist I follow the rule of journalistic ethics not to hand any investigated material to an authority. This way, the people I speak to don’t have to worry that material will be passed on to authorities. I also stay independent this way. The only exception is if I’m affected as a private person. But of course I encourage any reader to do so. You can contact your local data protection authority (they pass it on) or the French one, which is directly responsible.
why are you?
I have started a group of people who believe that we cannot wait for governments anymore, and instead must take action and give power to the people to stop stuff like this from happening.
I would love to have a quick chat with any of you.
We have some good ideas for the democratization of data, since its collection cannot be stopped. Maybe we need to level the playing field.
Can you tell us what you propose?
Great work. It’s important and appreciated!
I will find your donate button. This is real journalism.
Greetings from Chicago
we must do something about this soon but how?
If you open the TikTok app in Japan, the content is all about high school or middle school girls dancing in skimpy dresses or school uniforms. They obviously encourage them to do so. Everything they are doing wrong on privacy is one thing, but how they monetize the content to begin with is entirely gross as well.
Their algorithm shows dances like that a lot less now; they have only recently started monetizing content.
Does article 14 cover metadata? If not, it makes perfect sense why Facebook is able to collect such info, which sucks
I really don’t know why you would use such a service.
Great article! You have mentioned that “Any shortened URL for a video (like vm.tiktok.com/9uTpDV) gets resolved to a URL containing the installation ID. Thereby, TikTok is able to check who shared which video.” However, hashing is a one-way process, i.e. from the shortened URL one cannot recover the input URL and installation ID, right? If so, how would TikTok get the installation ID from the shortened URL? One way to do this would be to maintain a hash of every device ID with every video on the planet. The memory requirement would be so large that their servers would crash. So I’m not sure how you can claim that “TikTok is able to check who shared which video.” Can you please share your thoughts?
Hi, thanks for the praise. The shortened URL has to be resolved, otherwise TikTok couldn’t deliver the correct video belonging to it. This is done server-side via a database table and not by decrypting in the browser, as you correctly observed. So there definitely exists a database on the server that can map every shortened URL (which is built exclusively for every share activity by a user, not for every video!) to its long form in a split second. No magic here: YouTube also has to resolve the video ID strings in its URLs server-side to every video file ever uploaded. After that, normal web server log technology could be used to track the ID in the long-form URL, which now appears as a second request in the log files (after the redirect). As we know from Facebook and its Hadoop log processing, the industry doesn’t always run tracking analysis for all views that ever occurred, but rather only for a relevant time span. Probably they aren’t processing it at all until an advertising interest (or other interest?) arises, and then they run a confined log query. But these details are just my guess; I have no insight into TikTok’s database or today’s server-side database technology.
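The table lookup described in this reply can be sketched as follows. It is a toy illustration of the general idea only, with no knowledge of TikTok's actual backend: the class `ShareLinkTable`, the URL layout and the `sharer` parameter are all hypothetical. The point is that the short code is a random key stored once per share, so nothing has to be reversed or decrypted.

```python
import secrets


class ShareLinkTable:
    """Toy server-side table: one row per share action, keyed by a random code."""

    def __init__(self):
        self._rows = {}

    def create(self, video_id: str, sharer_id: str) -> str:
        """Store a new share and return its short code."""
        code = secrets.token_urlsafe(5)
        while code in self._rows:  # retry on the unlikely event of a collision
            code = secrets.token_urlsafe(5)
        self._rows[code] = (video_id, sharer_id)
        return code

    def resolve(self, code: str) -> str:
        """Look up a short code and build the long redirect target."""
        video_id, sharer_id = self._rows[code]
        # Hypothetical long-URL layout that carries the sharer's ID.
        return f"/video/{video_id}?sharer={sharer_id}"


table = ShareLinkTable()
code_a = table.create("v42", "install-A")
code_b = table.create("v42", "install-B")  # same video, different sharer
print(table.resolve(code_a))  # prints "/video/v42?sharer=install-A"
```

Storage grows with the number of shares that actually happen, not with every possible combination of device and video.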
Have you done any comparative analysis with apps like Facebook, Instagram, Snapchat and the like? What’s more meaningful to look at is whether TikTok is worse than its competitors in terms of the privacy breaches you found.
No, unfortunately not. It’s hard to judge what is worse anyway, because everything is interconnected. In the end, the whole app system is the problem. But it’s important to understand that Facebook and Instagram are not really comparable yet. They are in a different position, as they don’t have to share data. Facebook is a monopoly and not dependent on sharing data. So much of the tracking and analysis can happen server-side, as they control the whole ad chain from behaviour analytics through ID-serving and data analysis to finally showing ads. TikTok is just in the process of building all of this.
Hello, I seem to have been running in circles for weeks trying to decipher the crap I’ve got on my Android phones… I’m not a programmer, but I have access to elevated, privileged system processes… And I’m worried about all these packages on my mobile phone: core Google apps, Gmail, YouTube, browser, location services and emergency services for trackers, even my CAR! I’m not sure yet how to determine exactly what triggers an obfuscated slew of calls and services that are clearly logging “netstats” as well as my lists of packages, dictionary, contact lists, basically everything. Even photos.
This is way over my head but maybe I’m just being ultra paranoid, although I think not.
Interaction Process Endpoint Service
Internal Foreground Service
Gateway. Tiktok (something like that)
GMS and Google Chimera services, SQL server even. Any advice? Other than pray ?…
By the way, even though I didn’t allow TikTok to access phone numbers, it asked me if I want to follow the TikTok accounts of friends I chat with on Facebook Messenger. And even though I never searched for cyber security on TikTok, it always displays cyber security ads, and the only place where I look at cyber security stuff is Twitter. In other words, TikTok spies on other apps on the phone 🙂