Blog

19 Mar 2018

Facebook hired a forensics firm to investigate Cambridge Analytica as stock falls 7%

Hoping to tamp down the furor that erupted over reports that its user data was improperly acquired by Cambridge Analytica, Facebook has hired the digital forensics firm Stroz Friedberg to audit the political consulting and marketing firm.

In a statement, Facebook said that Cambridge Analytica has agreed to comply and give Stroz Friedberg access to their servers and systems.

Facebook has also reached out to the whistleblower Christopher Wylie and Aleksandr Kogan, the Cambridge University professor who developed an application which collected data that he then sold to Cambridge Analytica.

Kogan has consented to the audit, but Wylie, who has positioned himself as one of the architects for the data collection scheme before becoming a whistleblower, declined, according to Facebook.

The move comes after a brutal day for Facebook’s stock on the Nasdaq stock exchange. Facebook shares plummeted 7%, erasing roughly $40 billion in market capitalization amid fears that the growing scandal could lead to greater regulation of the social media juggernaut.

Indeed, both the Dow Jones Industrial Average and the Nasdaq fell sharply as worries over increased regulation of technology companies ricocheted around trading floors, forcing a sell-off.

“This is part of a comprehensive internal and external review that we are conducting to determine the accuracy of the claims that the Facebook data in question still exists. This is data Cambridge Analytica, SCL, Mr. Wylie, and Mr. Kogan certified to Facebook had been destroyed. If this data still exists, it would be a grave violation of Facebook’s policies and an unacceptable violation of trust and the commitments these groups made,” Facebook said in a statement.

However, as more than one Twitter user noted, this is an instance where they’re trying to close Pandora’s Box but the only thing that the company has left inside is… hope.

The bigger issue is that Facebook had known about the data leak as early as two years ago, but did nothing to inform its users — because the violation was not a “breach” of Facebook’s security protocols.

Facebook’s own argument for the protections it now has in place is a sign of its too-little, too-late response to a problem it created for itself with its initial policies.

“We are moving aggressively to determine the accuracy of these claims. We remain committed to vigorously enforcing our policies to protect people’s information. We also want to be clear that today when developers create apps that ask for certain information from people, we conduct a robust review to identify potential policy violations and to assess whether the app has a legitimate use for the data,” the company said in a statement. “We actually reject a significant number of apps through this process. Kogan’s app would not be permitted access to detailed friends’ data today.”

It doesn’t take a billionaire Harvard dropout genius to know that allowing third parties to access personal data without an individual’s consent is shady. And that’s what Facebook’s policies used to allow by letting Facebook “friends” basically authorize the use of a user’s personal data for them.

As we noted when the API changes first took effect in 2015:

Apps don’t have to delete data they’ve already pulled. If someone gave your data to an app, it could go on using it. However, if you request that a developer delete your data, it has to comply. How you submit those requests varies app to app: through a form, via email, or in other ways. You can also always go to your App Privacy Settings and remove permissions for an app to pull more data about you in the future.

19 Mar 2018

Here’s how Uber’s self-driving cars are supposed to detect pedestrians

A self-driving vehicle made by Uber has struck and killed a pedestrian. It’s the first such incident and will certainly be scrutinized like no other autonomous vehicle interaction in the past. But on the face of it, it’s hard to understand how, short of a total system failure, this could happen, when the entire car has essentially been designed around preventing exactly this situation from occurring.

Something unexpectedly entering the vehicle’s path is pretty much the first emergency event that autonomous car engineers look at. The situation could be many things — a stopped car, a deer, a pedestrian — and the systems are one and all designed to detect them as early as possible, identify them and take appropriate action. That could be slowing, stopping, swerving, anything.

Uber’s vehicles are equipped with several different imaging systems that handle both ordinary duty (monitoring nearby cars, signs and lane markings) and extraordinary duty like that just described. No fewer than four different ones should have picked up the victim in this case.

Top-mounted lidar. The bucket-shaped item on top of these cars is a lidar, or light detection and ranging, system that produces a 3D image of the car’s surroundings multiple times per second. Using infrared laser pulses that bounce off objects and return to the sensor, lidar can detect static and moving objects in considerable detail, day or night.
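
The ranging half of lidar comes down to simple arithmetic: a pulse’s round-trip time, multiplied by the speed of light, gives twice the distance to whatever it bounced off. A minimal sketch (an illustration only, not Uber’s software; the function name is hypothetical):

```python
# Hypothetical illustration of lidar ranging: a laser pulse is timed on its
# round trip, and distance = (speed of light x elapsed time) / 2.
C = 299_792_458.0  # speed of light in a vacuum, m/s

def lidar_range_m(round_trip_s: float) -> float:
    """Distance to a target, in meters, from a pulse's round-trip time."""
    return C * round_trip_s / 2.0

# A pulse returning after ~667 nanoseconds implies a target roughly 100 m away.
print(round(lidar_range_m(667e-9)))  # → 100
```

Sweeping millions of such measurements per second across different angles is what builds up the 3D point cloud described above.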

This is an example of lidar-created imagery, though not specifically what the Uber vehicle would have seen.

Heavy snow and fog can obscure a lidar’s lasers, and its accuracy decreases with range, but for anything from a few feet to a few hundred feet, it’s an invaluable imaging tool and one that is found on practically every self-driving car.

The lidar unit, if operating correctly, should have been able to make out the person in question, if they were not totally obscured, while they were still more than a hundred feet away, and passed on their presence to the “brain” that collates the imagery.

Front-mounted radar. Radar, like lidar, sends out a signal and waits for it to bounce back, but it uses radio waves instead of light. This makes it more resistant to interference, since radio can pass through snow and fog, but also lowers its resolution and changes its range profile.

Tesla’s Autopilot relies mostly on radar.

Depending on the radar unit Uber employed — likely multiple in both front and back to provide 360 degrees of coverage — the range could differ considerably. If it’s meant to complement the lidar, chances are it overlaps considerably, but is built more to identify other cars and larger obstacles.

The radar signature of a person is not nearly so recognizable, but it’s very likely they would have at least shown up, confirming what the lidar detected.

Short and long-range optical cameras. Lidar and radar are great for locating shapes, but they’re no good for reading signs, figuring out what color something is and so on. That’s a job for visible-light cameras with sophisticated computer vision algorithms running in real time on their imagery.

The cameras on the Uber vehicle watch for telltale patterns that indicate braking vehicles (sudden red lights), traffic lights, crossing pedestrians and so on. Especially on the front end of the car, multiple angles and types of camera would be used, so as to get a complete picture of the scene into which the car is driving.

Detecting people is one of the most commonly attempted computer vision problems, and the algorithms that do it have gotten quite good. “Segmenting” an image, as it’s often called, generally also involves identifying things like signs, trees, sidewalks and more.

That said, it can be hard at night. But that’s an obvious problem, the answer to which is the previous two systems, which work night and day. Even in pitch darkness, a person wearing all black would show up on lidar and radar, warning the car that it should perhaps slow and be ready to see that person in the headlights. That’s probably why a night-vision system isn’t commonly found in self-driving vehicles (I can’t be sure there isn’t one on the Uber car, but it seems unlikely).

Safety driver. It may sound cynical to refer to a person as a system, but the safety drivers in these cars are very much acting in the capacity of an all-purpose failsafe. People are very good at detecting things, even though we don’t have lasers coming out of our eyes. And our reaction times aren’t the best, but if it’s clear that the car isn’t going to respond, or has responded wrongly, a trained safety driver will react correctly.

Worth mentioning is that there is also a central computing unit that takes the input from these sources and creates its own more complete representation of the world around the car. A person may disappear behind a car in front of the system’s sensors, for instance, and no longer be visible for a second or two, but that doesn’t mean they ceased existing. This goes beyond simple object recognition and begins to bring in broader concepts of intelligence such as object permanence, predicting actions and the like.
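
The object-permanence idea can be sketched very simply: a tracked object survives for some number of frames after its last detection before the system gives up on it. This toy example is my own illustration, not Uber’s actual tracking software, and the persistence window is an assumed parameter:

```python
# Toy sketch of "object permanence" in a tracker: an object briefly occluded
# (e.g. a pedestrian passing behind a car) is kept alive for a few frames
# rather than forgotten the instant it stops being detected.
PERSIST_FRAMES = 5  # assumed tolerance for missed detections

def update_tracks(tracks: dict, detections: set, frame: int) -> dict:
    """Record which frame each object was last seen; drop long-vanished ones."""
    for obj_id in detections:
        tracks[obj_id] = frame  # seen this frame
    return {obj: seen for obj, seen in tracks.items()
            if frame - seen <= PERSIST_FRAMES}

tracks = {}
tracks = update_tracks(tracks, {"pedestrian_1"}, frame=10)
tracks = update_tracks(tracks, set(), frame=12)  # occluded, but still tracked
assert "pedestrian_1" in tracks
tracks = update_tracks(tracks, set(), frame=20)  # unseen too long, dropped
print("pedestrian_1" in tracks)  # → False
```

A real system would track positions and velocities and predict where an occluded object should reappear, but the persistence logic is the same in spirit.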

It’s also arguably the most advanced and closely guarded part of any self-driving car system and so is kept well under wraps.

It isn’t clear what the circumstances were under which this tragedy played out, but the car was certainly equipped with technology that was intended to, and should have, detected the person and caused the car to react appropriately. Furthermore, if one system didn’t work, another should have sufficed; multiple fallbacks are only practical in high-stakes matters like driving on public roads.

We’ll know more as Uber, local law enforcement, federal authorities and others investigate the accident.

19 Mar 2018

IBM working on ‘world’s smallest computer’ to attach to just about everything

IBM is hard at work on the problem of ubiquitous computing, and its approach, understandably enough, is to make a computer small enough that you might mistake it for a grain of sand. Eventually these omnipresent tiny computers could help authenticate products, track medications, and more.

Look closely at the image above and you’ll see the device both on that pile of salt, and on the person’s finger. No, not that big one. Look closer:

It’s an evolution of IBM’s “crypto anchor” program, which uses a variety of methods to create what amounts to high-tech watermarks for products that verify they’re, for example, from the factory the distributor claims they are, and not counterfeits mixed in with genuine items.

The “world’s smallest computer,” as IBM continually refers to it, is meant to bring blockchain capability into this; the security advantages of blockchain-based logistics and tracking could be brought to something as benign as a bottle of wine or box of cereal.
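
The core property such a tracking log needs can be shown in a few lines: each record incorporates a hash of the one before it, so altering any entry invalidates every later one. This is a hedged sketch of the general technique only, not IBM’s actual design:

```python
import hashlib

# Sketch of a hash-chained supply-chain log (the general blockchain idea,
# not IBM's implementation): each event's digest covers the previous digest,
# so tampering with any entry breaks verification of everything after it.
def append_event(chain: list, event: str) -> list:
    prev = chain[-1][0] if chain else "genesis"
    digest = hashlib.sha256((prev + event).encode()).hexdigest()
    return chain + [(digest, event)]

def verify(chain: list) -> bool:
    prev = "genesis"
    for digest, event in chain:
        if hashlib.sha256((prev + event).encode()).hexdigest() != digest:
            return False
        prev = digest
    return True

chain = append_event([], "bottled at factory A")
chain = append_event(chain, "received by distributor B")
print(verify(chain))  # → True
```

A tiny embedded computer could sign and report such events at each step; rewriting history after the fact would require recomputing the whole chain.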

A schematic shows the parts (you’ll want to view full size).

In addition to getting the computers extra-tiny, IBM intends to make them extra-cheap, perhaps ten cents apiece. So there’s not much of a lower limit on what types of products could be equipped with the tech.

Not only that, but the usual promises of ubiquitous computing also apply: this smart dust could be all over the place, doing little calculations, sensing conditions, connecting with other motes and the internet to allow… well, use your imagination.

It’s small (about 1mm x 1mm), but it still has the power of a complete computer, albeit not a hot new one. With a few hundred thousand transistors, a bit of RAM, a solar cell and a communications module, it has about the power of a chip from 1990. And we got a lot done on those, right?

Of course at this point it’s very much still a research project in IBM’s labs, not quite a reality; the project is being promoted as part of the company’s “five in five” predictions of turns technology will take in the next five years.

19 Mar 2018

Instagram Stories gets ‘quote tweet’-style feed post resharing

Instagram’s next big Stories feature could let you compliment or trash talk other people’s feed posts, or embed a “see post” button to promote your own. A TechCrunch reader sent us these screenshots of the new feature, which Instagram confirmed to us is appearing to a small subset of users. “We’re always testing ways to make it easier to share any moment with friends on Instagram,” a spokesperson wrote. Now those moments can include dunking on people.

Instagram has never had a true “regram” feature for the feed, just slews of unofficial and sometimes scammy apps, but this is perhaps the closest thing. Users often screenshot feed posts and share them in Stories with overlaid commentary, but this limits the cropping and commentary options. Making an official “reshare” option could unlock all sorts of new user behaviors, from meme curation to burn book shade throwing to social stars teasing their feed posts in their Stories. Brands might love it for using their Stories to cross-promote a big ad campaign. Employing Stories to drive extra Likes and comments to permanent posts could help them gain more visibility in Instagram’s feed ranking algorithm.

Here’s how the feed post to Instagram Stories sharing feature works. You pick any public, permanent Instagram post and tap a button to embed it in your Story. You can tap to change the design to highlight or downplay the post’s author, move and resize it within your Story post, and add commentary or imagery using Instagram’s creative tools. When people view the story, they can tap on the post embed to bring up a “see post” button which opens the permanent feed post.

Users who don’t want their posts to be “quote-Storied” can turn off the option in their settings, and only public posts can be reshared. Facebook says it doesn’t have details about a wider potential rollout beyond the small percentage of users currently with access. But given the popularity of apps like Repost For Instagram, I expect the feature to be popular and eventually open to everyone.

Quote-Storying could help keep the feed relevant as more users spend their time sharing to the little bubbles that sit above it. And it offers a powerful viral discovery mechanism for creators, who can now ask fans to quickly reshare their posts rather than having to awkwardly screenshot and upload them.

While both Instagram and Snapchat have let people privately send other people’s posts to friends as private messages, Snapchat lacks a way to embed other Stories or Discover content in your Story. Snapchat may have pioneered the Stories format, but Instagram has been rapidly iterating with features like Super Zoom and Highlights to extend its user count lead over the app it cloned.

The move by Instagram further ties together the three parts of its app: the permanent feed, ephemeral Stories, and private Direct messaging. You can imagine someone finding a post in the feed, resharing it to their Story, then joking about it with friends over Direct. It’s this multi-modal social media usage that turns casual users into loyal, ad revenue-generating ‘Grammers.

19 Mar 2018

Microsoft is adding a bunch of accessibility features to Windows 10

Microsoft plans to bring a number of new features for users with visual impairment to Windows 10, the company announced in a blog post earlier today. Chief among the updates, which are due out with the next version of the desktop software, are additions to the Ease of Access setting panel.

The updated page will be grouped by vision, hearing and interaction, with the most frequently used settings listed first. A number of new settings are being added to the page, as well, including the ability to “Make Everything Bigger” and “Make Everything Brighter.”

Narrator, the company’s screen-reading app, is being tweaked to be more responsive to keyboard input and offer more continuous control reading. The feature has also been tweaked to offer up more information like “page loading” in the Edge Browser, as well as letting users control text styles with vocal inflections. That means, instead of having to say, “start bold” to bold text, users can adjust the style with the sound of their voice.

Eye control is being improved as well, including the ability to pause eye control for reading and better navigation, though that feature is apparently still in beta testing. A number of the features are already in preview through Insider builds, for those who want to get an early jump on the action.

Microsoft’s also promising additional accessibility features later this year, in line with a promise CEO Satya Nadella made back in 2015 to “embrace inclusion in our product design and company culture.”

19 Mar 2018

Sierra Leone government denies the role of blockchain in its recent election

The National Electoral Commission of Sierra Leone has come out with a clarification – and, to a degree, an outright condemnation – of the news that theirs was one of the first elections recorded to the blockchain. While the blockchain voting company Agora claimed to have run the first blockchain-based election, it appears that the company did little more than observe the voting and store some of the results.

“The NEC [National Electoral Commission] has not used and is not using blockchain technology in any part of the electoral process,” said NEC head Mohamed Conteh. Why he is adamant about this fact is unclear – questions I asked went unanswered – but he and his team have created a set of machine-readable election results and posted the following clarification:

“Anonymized votes/ballots are being recorded on Agora’s blockchain, which will be publicly available for any interested party to review, count and validate,” said Agora’s Leonardo Gammar. “This is the first time a government election is using blockchain technology.”

In Africa the reactions were mixed. “It would be like me showing up to the UK election with my computer and saying, ‘let me enter your counting room, let me plug-in and count your results,’” said Morris Marah to RFI.

“Agora’s results for the two districts they tallied differed considerably from the official results, according to an analysis of the two sets of statistics carried out by RFI,” wrote RFI’s Daniel Finnan.

Clearly the technology is controversial, especially in election law and vote-counting. Established players are already trying mightily to avoid fraud and corruption, and Agora’s claim, no matter how plausible, further muddies those waters. Was Agora simply attempting a PR stunt in support of its upcoming token sale? That’s unclear. What is clear is the disappointment in Sierra Leone regarding their efforts.

UPDATE – Sierra Leone’s electoral committee responded:

19 Mar 2018

Meltwater has acquired DataSift to double down on social media analytics

In a week when all eyes are on Facebook and the subject of how data about us on social media platforms gets used without us knowing, there’s been some consolidation afoot in the world of media-based big data services. DataSift, the London-based company that pulls data from conversations across social, news and blog platforms, anonymises it, and then parses it for insights for third party organizations, is being acquired by Meltwater, the company originally out of Norway but now based in San Francisco that provides business intelligence services such as media monitoring and AI analytics on internal business communications.

Financial terms are not being disclosed for the deal but it includes technology, employees and DataSift’s customer base. DataSift had raised about $72 million in funding from investors that include Insight Venture Partners, Scale Ventures and Upfront Ventures and the company had never disclosed its valuation. Meltwater is bootstrapped and has never raised outside funding, but it has also been described as a “unicorn” with a billion-dollar valuation — a description that the company would not confirm but also does not contest.

DataSift’s CEO Tim Barker, who is taking on a role at Meltwater leading his team there, said that it’s business as usual for DataSift’s existing customers, while the two will also work on integrating their platforms together. Combined, the customer base includes media companies, brands and educational and other organizations that make use of the data. Disclosed customer names include Viacom, Ogilvy, Air France, Vans, Harvard Business School and Columbia Business School.

The news comes at an interesting time in the world of social media, and more specifically the data that swirls around it. Over the weekend, we saw a huge story break about how the analytics firm Cambridge Analytica was involved in what has amounted to a data scandal: an affiliate working with the firm had used an innocuous-looking research survey to in turn tap into the social graphs and the related data of respondents, by way of Facebook’s API, netting tens of millions of profiles in the process. The fallout is likely to be felt for a long time to come, and may well bring about a new kind of regulation and scrutiny over how personal data is harnessed and used in social networks.

While this is raising a lot of questions already about personal data and social media, DataSift and Meltwater, to be clear, sit at a different section of the data and media continuum. While Meltwater focuses on what’s produced either internally at a business, or by publications and other media companies, DataSift’s currency is the movement of information that’s already being put out onto social networks in public posts, rather than personal details or attributes of users. As Barker describes it, the company has taken an approach that it calls “privacy by design,” in which it works only with anonymised data to reach its insights, and that work is not focused on how to use that data to rebuild profiles or “types” that can be used to match people back up with ads or other marketing.
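
One common “privacy by design” pattern of the kind Barker describes is to strip direct identifiers before analysis, replacing them with one-way hashes so that aggregate signals survive but individual authors can’t be read back out. This is an illustrative sketch only, not DataSift’s actual pipeline, and the salt value and field names are assumptions:

```python
import hashlib

# Illustrative anonymisation sketch (not DataSift's real system): author
# names are replaced with salted one-way hashes before any analysis, so
# topic-level insights remain but identities are not carried through.
def anonymise(posts: list, salt: str = "per-deployment-secret") -> list:
    return [
        {
            "author": hashlib.sha256((salt + p["author"]).encode()).hexdigest()[:12],
            "topic": p["topic"],
        }
        for p in posts
    ]

posts = [{"author": "alice", "topic": "airlines"},
         {"author": "bob", "topic": "airlines"}]
anon = anonymise(posts)
print(any(p["author"] == "alice" for p in anon))  # → False
```

Downstream analytics can still count, for example, how much chatter a topic is getting, without ever handling a real username.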

The idea will be to bring that together with Meltwater’s existing business to enhance it.

“By combining advances in machine learning and the vast amount of publicly available information on the internet, you can today understand and track Porter’s Five Forces,” — a framework for analysing a business’ competitors —  “in real time to understand strategic opportunities and threats for your business. Executives that take advantage of this new opportunity create an unfair information advantage over those who don’t,” said Jorn Lyseggen, CEO and founder of Meltwater in a statement.

“DataSift has built a scalable platform that lets developers build data science-driven insights from social firehoses while protecting the privacy of an individual’s data. When combined with the data Meltwater captures and our AI capabilities, developers can disrupt the Business Intelligence space by either building new applications or complementing existing ones with unique signal that can be only derived from external data.”

All the same, it will be interesting to see what the effects are on businesses like these. For one, DataSift is currently built on the cooperation (and by the grace) of social networks — by way of APIs and access to firehoses of data that DataSift and companies like it use to feed their analytics engines. Whether the social media companies decide they would like to try their hands at some of that business themselves, or perhaps get told by regulators that they simply can no longer share information in this way, this leaves companies that are built on that access in a precarious position.

And that’s before you consider the effects of existing legislation like GDPR, which Meltwater says is something the company is built to handle.

“DataSift’s advanced analytics platform is a great complement to what we have in house, at a time of growing privacy concerns and regulation such as GDPR,” said Aditya Jami, Senior Director of Engineering and Head of AI at Meltwater. “DataSift’s technology will be instrumental… to deliver next generation insights.”

DataSift in its past had a very notable instance of getting cut off from one of those feeds, and feeling the strong aftereffects: after years of working closely together and being the main user of Twitter’s firehose of Tweets — access that was brokered when DataSift (then called TweetMeme) handed over to Twitter the first third-party website “retweet” button — Twitter cut off DataSift from its firehose in the wake of a move to beef up its own big-data analytics business.

DataSift eventually regrouped and now works with around 15 partners, including Facebook, LinkedIn and WordPress, but given that its original premise was based around the kind of real-time data that Twitter uniquely provides, it was a big shift for the startup.

Meltwater has had its own scuffles in the past with the third-party services it relies on to make the wheels of its business model turn. Both have moved on from those more spiky years, it seems.

Fast forward to today, the combination of Meltwater and DataSift makes some sense when you think about the evolution of media. The rise of social networking has created another playing field for businesses: they now have a new set of platforms where they can pick up chatter about themselves, and it’s become the hot new place to communicate with customers.

Whether Facebook wants to admit it or not, social networking has become the modern-day descendent of the old-school media industry, and this is one aspect of that. While DataSift was built on trying to better harness and parse chatter from the former, Meltwater was built on the back of media monitoring to essentially provide the same services on the latter.

 

19 Mar 2018

Meltwater has acquired DataSift to double down on social media analytics

In a week when all eyes are on Facebook and the subject of how data about us on social media platforms gets used without us knowing, there’s been some consolidation afoot in the world of media-based big data services. DataSift, the London-based company that pulls data from conversations across social, news and blog platforms, anonymises it, and then parses it for insights for third-party organizations, is being acquired by Meltwater, the company originally out of Norway but now based in San Francisco that provides business intelligence services such as media monitoring and AI analytics on internal business communications.

Financial terms are not being disclosed for the deal but it includes technology, employees and DataSift’s customer base. DataSift had raised about $72 million in funding from investors that include Insight Venture Partners, Scale Ventures and Upfront Ventures and the company had never disclosed its valuation. Meltwater is bootstrapped and has never raised outside funding, but it has also been described as a “unicorn” with a billion-dollar valuation — a description that the company would not confirm but also does not contest.

DataSift’s CEO Tim Barker, who is taking on a role at Meltwater leading his team there, said that it’s business as usual for DataSift’s existing customers, while the two will also work on integrating their platforms together. Combined, the customer base includes media companies, brands and educational and other organizations that make use of the data. Disclosed customer names include Viacom, Ogilvy, Air France, Vans, Harvard Business School and Columbia Business School.

The news comes at an interesting time in the world of social media, and more specifically the data that swirls around it. Over the weekend, we saw a huge story break about how the analytics firm Cambridge Analytica was involved in what has amounted to a data scandal: an affiliate working with the firm had used an innocuous-looking research survey to in turn tap into the social graphs and the related data of respondents, by way of Facebook’s API, netting tens of millions of profiles in the process. The fallout is likely to be felt for a long time to come, and may well bring about a new kind of regulation and scrutiny over how personal data is harnessed and used in social networks.

While this is raising a lot of questions already about personal data and social media, DataSift and Meltwater, to be clear, sit at a different section of the data and media continuum. While Meltwater focuses on what’s produced either internally at a business, or by publications and other media companies, DataSift’s currency is the movement of information that’s already being put out onto social networks in public posts, rather than personal details or attributes of users. As Barker describes it, the company has taken an approach that it calls “privacy by design,” in which it works only with anonymised data to reach its insights, and that work is not focused on how to use that data to rebuild profiles or “types” that can be used to match people back up with ads or other marketing.
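To make the "privacy by design" idea concrete, here is a minimal, purely illustrative sketch of what anonymised aggregate analysis can look like. The field names, salt and schema are assumptions for the example, not DataSift's actual pipeline: author IDs are replaced with a one-way hash, profile fields are dropped, and only public text plus a coarse date survive into the aggregation step.

```python
import hashlib
from collections import Counter

# Hypothetical values for illustration only -- not DataSift's real schema or salt.
SALT = b"rotating-salt"

def anonymise(post: dict) -> dict:
    """Replace the author ID with a one-way hash and drop all profile fields,
    keeping only the public text and a coarse (date-level) timestamp."""
    author_hash = hashlib.sha256(SALT + post["author_id"].encode()).hexdigest()
    return {
        "author": author_hash,
        "text": post["text"],
        "day": post["timestamp"][:10],  # truncate to YYYY-MM-DD: less identifying
    }

def mention_counts(posts, keyword):
    """An aggregate insight over anonymised posts: daily mentions of a keyword."""
    counts = Counter()
    for post in map(anonymise, posts):
        if keyword.lower() in post["text"].lower():
            counts[post["day"]] += 1
    return dict(counts)
```

The point of the design is that the insight (daily mention counts) never needs names, emails or stable user identifiers, so the output cannot be trivially matched back to individual profiles.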

The idea will be to bring that together with Meltwater’s existing business to enhance it.

“By combining advances in machine learning and the vast amount of publicly available information on the internet, you can today understand and track Porter’s Five Forces,” — a framework for analysing a business’ competitors —  “in real time to understand strategic opportunities and threats for your business. Executives that take advantage of this new opportunity create an unfair information advantage over those who don’t,” said Jorn Lyseggen, CEO and founder of Meltwater in a statement.

“DataSift has built a scalable platform that lets developers build data science-driven insights from social firehoses while protecting the privacy of an individual’s data. When combined with the data Meltwater captures and our AI capabilities, developers can disrupt the Business Intelligence space by either building new applications or complementing existing ones with unique signal that can be only derived from external data.”

All the same, it will be interesting to see what the effects are on businesses like these. For one, DataSift is currently built on the cooperation (and by the grace) of social networks — by way of APIs and access to firehoses of data that DataSift and companies like it use to feed their analytics engines. Whether the social media companies decide they would like to try their hands at some of that business themselves, or get told by regulators that they simply can no longer share information in this way, companies built on that access are left in a precarious position.

And that’s before you consider the effects of existing legislation like GDPR, which Meltwater says is something the company is built to handle.

“DataSift’s advanced analytics platform is a great complement to what we have in house, at a time of growing privacy concerns and regulation such as GDPR,” said Aditya Jami, Senior Director of Engineering and Head of AI at Meltwater. “DataSift’s technology will be instrumental… to deliver next generation insights.”

DataSift has in its past had a very notable instance of getting cut off from one of those feeds, and feeling the strong aftereffects: after years of working closely together as the main user of Twitter’s firehose of Tweets — access that was brokered when DataSift handed over to Twitter the first third-party website “retweet” button, created when DataSift was called TweetMeme — Twitter cut off DataSift from its firehose in a move to beef up its own big-data analytics business.

DataSift eventually regrouped and now works with around 15 partners, including Facebook, LinkedIn and WordPress, but given that its original premise was based around the kind of real-time data that Twitter uniquely provides, it was a big shift for the startup.

Meltwater has had its own scuffles in the past with the third-party services it relies on to make the wheels of its business model turn. Both have moved on from those more spiky years, it seems.

Fast-forward to today, and the combination of Meltwater and DataSift makes some sense when you think about the evolution of media. The rise of social networking has created another playing field for businesses: they now have a new set of platforms where they can pick up chatter about themselves, and it’s become the hot new place to communicate with customers.

Whether Facebook wants to admit it or not, social networking has become the modern-day descendant of the old-school media industry, and this is one aspect of that. While DataSift was built on trying to better harness and parse chatter from the former, Meltwater was built on the back of media monitoring to essentially provide the same services on the latter.