Date: 2019-05-06

“It’s a core part of what you need,” said Nipun Mathur, the director of product management for artificial intelligence (AI) at Facebook. “I don’t see the need going away.”

The content labelling programme could raise new privacy issues for Facebook, according to legal experts. The company is facing regulatory investigations worldwide over an unrelated set of alleged privacy abuses involving the sharing of user data with business partners.

The Wipro workers said they gain a window into users’ lives as they view a vacation photo or a post memorialising a deceased family member. Facebook acknowledged that some posts, including screenshots and those with comments, may include user names.

The company said its legal and privacy teams must sign off on all labelling efforts, adding that it recently introduced an auditing system “to ensure that privacy expectations are being followed and parameters in place are working as expected”.

But one former Facebook privacy manager, speaking on condition of anonymity, expressed unease about users’ posts being scrutinised without their explicit permission. The EU’s year-old General Data Protection Regulation has strict rules about how companies gather and use personal data and in many cases requires specific consent.

“One of the key pieces of the regulation is purpose limitation,” said John Kennedy, a partner at law firm Wiggin and Dana who has worked on outsourcing, privacy and AI.

If the purpose is looking at posts to improve the precision of services, that should be stated explicitly, Kennedy said. Using an outside vendor for the work could also require consent, he said.

It remains unclear exactly how the regulation will be interpreted and whether regulators and consumers would see Facebook’s internal labelling practices as problematic. Europe’s top data privacy official declined to comment on possible concerns.

A Facebook spokesperson said: “We make it clear in our data policy that we use the information people provide to Facebook to improve their experience and that we might work with service providers to help in this process.”

US senator Mark Warner, a Democrat and leading critic of social media, said that large platforms increasingly are “taking more and more data from users, for wider and more far-reaching uses, without any corresponding compensation to the user”.

Warner said he is drafting legislation that would require Facebook to “disclose the value of users’ data, and tell users exactly how their data is being monetised”.

Human-powered content labelling, also referred to as “data annotation”, is a growth industry as companies seek to harness data for AI training and other purposes.

Self-driving car companies such as Alphabet’s Waymo have labellers identify traffic lights and pedestrians in videos to fortify their AI. Voice assistant developers including Amazon.com have people annotate customer audio to improve AI’s ability to decipher speech.

Facebook launched the Wipro project in April 2018. The Indian firm received a $4m contract and formed a team of about 260 labellers, according to the workers. In 2018, the work consisted of analysing posts from the prior five years.

After completing that, the team was cut to about 30 in December and shifted to labelling, each month, posts from the previous month. Work is expected to last through at least the end of 2019, they said. Facebook confirmed the staffing changes but declined to comment on financial details.

The company said its analysis is ongoing so it could not provide any findings from the labelling or resulting product decisions. It has not told labellers the purpose or results of the project, and the workers said all they have inferred from their limited view is that selfies are increasingly popular.

The Wipro labellers and Facebook said the posts are a random sampling of text-based status updates, shared links, event posts, Stories feature uploads, videos and photos, including user-posted screenshots of chats on Facebook’s various messaging apps. The posts come from Facebook and Instagram users globally, in languages including English, Hindi and Arabic.

Each item goes to two labellers to check accuracy, and a third if they disagree, Facebook said. Workers said they see on average 700 items per day. Facebook said the target average is lower.

Facebook confirmed that labellers in Timisoara, Romania, and Manila, the Philippines, are involved in the same project.