Interview with Sara Yu, Deputy General Counsel of Alibaba Group

Sara Yu, Deputy General Counsel of Alibaba Group

Note: Sara Yu was recently promoted to General Counsel of Alibaba Group.

In the internet industry, technology evolves rapidly and many legal issues arise. These problems are often unfamiliar and new, even to the world at large. How does Alibaba see and manage “data”? Here, an in-house counsel becomes a thinker who explores the unknown and finds possibilities for the enterprise.

Q: In the era of the internet, communication goes beyond language, and is based on data. Alibaba is a very big platform. Who do you think the data belong to?

Sara Yu: What makes data different from other assets? Why do many people panic when data are mentioned? In fact, what we are facing is a fear of the unknown, not of the data themselves.

Data, by nature, differ from other assets in many ways and cannot be defined by the traditional concept of assets. With a traditional asset, say you give me your mobile phone, then you no longer have it, right? So there is a kind of exclusivity to traditional assets.

But data are different. If you copy a data collection to me, it becomes two copies. Giving it to me does not mean that you no longer have it. You still have your copy. Data are renewable and reproducible. The more people use data, the greater the value. In this respect, the attributes of data differ from the exclusivity of traditional assets.

The general public’s understanding of many concepts needs to be updated. Take “data collection”, for example: we tend to understand it as deliberately recording data. Here, the word “record” is more accurate, whereas I have some doubts about the word “collection”. The biggest difference between the digital economy era and the traditional physical environment is that the internet can reduce the cost of recording data to almost zero.

It’s like when we walk around this room today: we actually leave footprints, but why don’t we collect and analyze them? Because doing so cannot produce economic value. The cost of collection is too high. Only when a criminal offence has occurred might the footprints be collected.

In the future of online communications, all actions will actually be recorded in real time. The emergence of the internet has simply reduced the cost of recording behaviour enormously, which I think is a particularly important feature. Hence, we might describe this not with the word “collection” but with “record”. The tools you use, such as your devices and networks, will naturally record your behaviour.

There is also a misunderstanding about “data flow”. Many people believe that data flow means “I gave it to you”. However, there is now a term called “open data”. Take “government information disclosure”, for example. Once the information is made public, you do not in fact have to make it “flow”. It naturally becomes reproducible, and the replication itself is a kind of flow.

What is the idea behind the government’s “one run at most” policy for receiving filing materials? It’s that people don’t have to run. Let the data run. Connect the systems. The idea behind it is that data are a bridge across different groups of people, different professions and different service providers. Data can flow, which will definitely be the trend of the future, and this is something we are more than willing to see.

When all objective facts are recorded, they actually turn into computer code, 0101. When we record them in computer language, each description itself is also data. Data are actually a set of symbols and symbol systems that we use to describe the objective world. For example, when introducing myself, I will read my name, age, birth date, gender, and where I work to my mobile phone. These fields are themselves data.

Any individual is ultimately a collection of data. Once this kind of information accumulates into mass data, it is no longer a matter of one person peeping with the naked eye, but rather of algorithms doing the processing and calculation. So, in my opinion, as long as we collect data lawfully and take care of security protection, the remaining question of usage is managed and presented through algorithms.

Q: Can privacy be equated with data?

Sara Yu: Nowadays, there are a lot of concepts like these. Some are called “data”, some are called “personal information”, and some are called “privacy”. As long as others can deduce a specific person from these data fields, the fields might be seen as sensitive information.

Depending on the level of sensitivity, people may use the word “privacy” to describe it. Take “I” and “like eating chocolate”, for example: the combination of the two is personal information. However, preferences themselves change over time. I might have liked it three years ago, but not anymore. The data field “like eating chocolate” marks out a group of people, and is not necessarily meaningful for every individual in it. Hence, privacy is a relative concept.

We tend to worry about the privacy issues that big data bring. When doing something, we fear that someone will be watching us. As a matter of fact, we need to look into the future first, and then look at the present. Are the concerns that we have part of a process, or a result? We should all realize that in the future, technologies like cloud computing, the IoT and blockchain can connect all of us in different ways, connecting not only people but also home appliances: the hours during which the lights in my home are on, the energy consumption, and so on. You will find that for anything that is in some sense a public service, analysis of users’ usage data is a must in order to improve the service and maintain the necessary quality and capability. This is the inevitable trend.

The emergence of new things makes our life more convenient, whether we want it or not. It will come inexorably. At this point we can only accept it, and try to balance out the risks it brings. In the process, we will feel worried, and fear getting hurt. In my opinion, these concerns will not become reality, because we still have a lot of unknowns today, and we cannot stop simply because we are worried about the unknown itself. We must try to address these concerns during the process.

In the future, all enterprises will be driven by big data. A high degree of digitalization is becoming a basic feature of individual dwellings. If this is a common problem faced by the whole industry, society and all individuals, then first, we should not focus only on internet companies, and second, the experience internet companies have gained in solving these problems with technology may become the new rules of the whole market, and the new approach to solving these problems.

The interview with Sara Yu was conducted during the 2019 Legal Tech Innovation Summit in Hangzhou on 24 November 2019.