Alex Pentland’s article on Data-Driven Society

I recently got the new issue of Scientific American (October 2013), and the cover announced the article ‘The Data-Driven Society’ by Alex Pentland.  I just had to read it 🙂

He co-leads the World Economic Forum’s Big Data and Personal Data initiatives.  He talks about all the digital bread crumbs we leave behind in our daily lives (like GPS and GSM info, or electronic payments) and what can be done with them.

With his students at the MIT Human Dynamics Laboratory, he is discovering mathematical patterns through data analytics that can predict human behaviour. ‘Bread crumbs record our behaviors as it really happens’, he says; they are more accurate than the information from social media, where we choose what we want to disclose about ourselves.  Alex and his team are particularly interested in patterns of idea flow.

Among the most surprising findings that my students and I have discovered is that patterns of idea flow (measured by purchasing behavior, physical mobility or communications) are directly related to productivity growth and creative output.

Analysing those flows, he uncovered two factors that characterise a healthy pattern of idea flow:

  • engagement: connecting to others, usually in the same team or organisation, and
  • exploration: going abroad to exchange ideas.

Both are needed for creativity and innovation to flourish.  To find those factors, he based his research on graphs of different types of interactions, like person-to-person contact, emails, SMS and so on.
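To make the two factors a bit more concrete, here is a minimal sketch of how one could count them from an interaction log.  It is only my own illustration, not Pentland’s method: the names, the teams and the rule ‘same team = engagement, different team = exploration’ are all assumptions.

```python
from collections import Counter

# Hypothetical data: who belongs to which team (names are made up).
TEAM = {"ana": "red", "bob": "red", "cho": "blue", "dev": "blue"}

# Hypothetical interaction log: pairs of people, e.g. taken from emails or SMS.
INTERACTIONS = [
    ("ana", "bob"), ("ana", "bob"), ("ana", "cho"),
    ("bob", "dev"), ("cho", "dev"),
]

def idea_flow_counts(interactions, team):
    """Count, per person, contacts inside the team (engagement)
    and contacts outside it (exploration)."""
    engagement, exploration = Counter(), Counter()
    for a, b in interactions:
        bucket = engagement if team[a] == team[b] else exploration
        bucket[a] += 1
        bucket[b] += 1
    return engagement, exploration

engagement, exploration = idea_flow_counts(INTERACTIONS, TEAM)
for person in sorted(TEAM):
    print(person, "engagement:", engagement[person],
          "exploration:", exploration[person])
```

With real data the interesting part would be comparing these counts with some measure of productivity or creative output, which this sketch does not attempt.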

We may not have the tools he used (like electronic badges for tracking person-to-person interactions), but intuitively this is something we know: good communication is essential for the success of a team, and talking to an outsider may provide new insight.  It’s always good to be proved right, isn’t it?

Check my next post, where I’ll continue with his article: he presents a lot of great concepts as the ‘new deal on data’ for personal data protection.

 

Massive Open Online Courses

Massive Open Online Courses (MOOCs) are very recent, but are quickly gaining popularity.  Coursera is one of the big platforms that offer those free courses, along with edX and Khan Academy, just to mention a few.  Last year I took a fantastic course offered by Coursera called ‘Model Thinking’, given by Prof. Scott E. Page, who’s the Director of the Center for the Study of Complex Systems at the University of Michigan (I already posted about it here :-) ).

In March this year, I was glad to receive an email from Scott Page, giving us some feedback on his experience of teaching this course, along with a link to a presentation he gave about the making of the course.

To give you an idea of the popularity of this course, there were 60,000 students enrolled in the first run of Model Thinking, at the beginning of 2012.  It grew to 100,000 for the fall run (by the way, if you are interested, there will be a new run this fall 2013, and it may be the last one, says Prof. Page).

I would like to share with you Scott’s insights on his experience of making this online course, contrasting it with the making of his other online course, ‘The Hidden Factor’.  That one was professionally done in a studio, and he called ‘Model Thinking’ ‘my garage band online course’ :-)

In fact, it really was recorded in an unused room of his house, because he said that the heating system starting and stopping in the rest of the house was picked up by his mike, so sensitive was it even though it was just a $100 one.

To prepare the course, he decided to make it more modular.  He cut it into small chunks, so that each video was independent and covered a subject in no more than 15 minutes.  But as he said, that was the easiest part; what took him much more time was recording each lecture.  One big issue was that he was alone in the room while recording, and trying to be smiling, engaging and enthusiastic is difficult without an audience.  Not only that, but unforeseen events happened from time to time, like his dog wandering around; he laughed and found himself making funny movements to chase him away.

The editing took a lot of time: each video had to be reviewed, and errors were difficult to fix.  So in the end, some mistakes remained.  In the professional approach, on the other hand, they took care of every error, but they had better tools and a battery of technicians to look into them and find different ways to correct them.  Sometimes he had to repeat a single word they had detected he had stumbled over, and they even told him the intonation to use when repeating it; sometimes they just inserted a picture about the subject he was talking about, so that he could rephrase a sentence.

In conclusion, here’s his comparison of the costs of making the two courses:

 

So professional quality is much more costly. Time-wise, surprisingly, the two were more or less equivalent:

 

The studio-made video was indisputably better, since it was much easier to correct any mistakes:

 

 

But in the end, is the improvement in quality worth the cost?  Not really, he says: the best quality is not needed, and a good-enough approach is better, even more so if the cost would otherwise prevent the course from being made.  So the best solution lies somewhere between those two options.

I also found very important his comment on how presenting this course changed his everyday work life.  He now spends one hour per day reading his mail and answering diverse requests in his area of expertise.  He receives inquiries from technical advisors, deans and various influential people that he cannot really dismiss.  On the one hand, it’s not strictly the job he is paid for; on the other hand, can these requests be ignored? Is it responsible to ignore them if you know your intervention can have such an impact as to produce better policies and improve many people’s lives?

Snowden showed us the dangers of Big Data with PRISM: are we up to the challenge of steering its use?

A television screen shows former U.S. spy agency contractor Edward Snowden during a news bulletin at a cafe at Moscow’s Sheremetyevo airport June 26, 2013. Credit: Reuters/Sergei Karpukhin

 

As we already discussed in my Big Data presentations, being able to analyse the mass of data that traces all our actions and movements is a great opportunity to improve our lives, as well as to do business, but it can also be exploited for the worst.  Now that Edward Snowden has put a clear case under the spotlight, will this make us move? Will this lead to change?

It’s time to consider what ethical codes and regulations can be issued, so that this excellent opportunity that technology is putting in our hands, namely sharing, measuring and extracting knowledge from all aspects of our lives, is not misused.

Small talk on Big Data

Last week I presented this topic to professional women at PWI here in Brussels. It’s called ‘small talk’ because it is not a technical presentation but one for a broader audience, to create awareness of the Big Data trend.  The main idea I wanted them to take away is the change in the business arena and in our society due to Big Data. If you are interested in this subject, just drop me a line and let me know!

Prices of disks and storage devices have dropped a lot, so now basically any digital data is being stored.  The cost is so low that it is worth saving it ‘just in case, and we’ll see in the future what we can do with this data’.  Technology has also made huge advances with massively parallel processing, and we can manage to juggle thousands of servers to analyse a bunch of diverse data and extract information from it in a usable time-frame.

This allows business strategists to make smarter decisions based on facts, rather than on experience or intuition as before.  So the message for all decision-makers is: go and check your data; you’ll find valuable information there to help decide any business matter.  Also, be aware that your competition is doing it too, and it can outsmart you!

At the society level, there are many ethical issues to deal with, like privacy, or equality and fairness.  What do you think: is it fair to have a subsidy that is ‘personalised’, that may give more to one person than to others because of a particular factor, or to allow access to a health treatment to one person and not to another based on life expectancy, for example?  What about basing the decision on the person’s ‘ROI’, like the capability of paying back for the given treatment?  Or is it fairer to have equal subsidies instead, the same amount for everyone? Even for those who could pay for it by themselves? Either we discuss these questions beforehand, or we will be at the mercy of any politician or entrepreneur taking a step further in an unethical direction.

And as a last twist, I would like to point out that the basic value of knowledge is being challenged.  We are already experiencing a change of values: knowledge is less and less valued as an asset, and value now lies in knowing how to get to the knowledge, where to find it and what to extract from data.

 

Women in Tech: There Is Hope, says Vivek Wadhwa

I loved this article by Vivek Wadhwa, from Stanford University, about (the lack of) women in technology.  He says that Silicon Valley seemed to him a meritocracy, as a lot of nationalities were represented, but then his wife made him see the missing element… women!  You could argue that for real diversity we should also look at the representation of other minorities, but women are not a minority: we are basically half of the population!  Thinking about the article again, more than Vivek, I love his wife :-))

Now the good news from his investigation:

This raised the question: are women less competent as entrepreneurs than men are? Are they not cut out for the rough-and-tumble world of entrepreneurship? The answer turned out to be none of this. An analysis performed by the Kauffman Foundation showed that women are more capital-efficient than men. Babson’s Global Entrepreneurship Monitor found that women-led high-tech startups have lower failure rates than those led by men. Other research has shown that venture-backed companies run by women have annual revenues 12 percent higher than those by men and that organizations that are the most inclusive of women in top management positions achieve a 35% higher return on equity and 34% higher total return to shareholders.

So men, find yourself a woman partner for your next business, you’ll have a competitive advantage from the start 😉

 

Big Data, a trend to follow for business innovation

Every day more than 2.5 quintillion (2.5 x 10^18) bytes of data are created, coming from business and bank transactions, posts on social media sites, digital photos, videos, sensors such as GPS, and more. Big Data is the name given to the mass of unstructured data available nowadays on the Internet.

This large amount of data is a big resource, and many of these data sets are available to everybody. Some companies are already exploiting it: you may have guessed that Google looks at the subjects you are interested in and presents you with ads related to that content. Facebook, for example, looks at the friends of your friends to suggest new contacts. Other examples are less obvious but full of good business sense, like an airline improving on the pilot’s ETA for a flight using, among other things, weather and air traffic information.  The new ETA is more accurate and makes it possible to reduce idle time at airports.  The McKinsey Global Institute calls Big Data ‘the next frontier for innovation, competition and productivity’.
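To get a feel for that number, here is a small back-of-the-envelope conversion.  The only input is the daily figure quoted above; the decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes) are my choice.

```python
BYTES_PER_DAY = 2.5e18            # figure quoted in the post
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

# Decimal units: 1 exabyte = 10**18 bytes, 1 terabyte = 10**12 bytes.
exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 1e12

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

In other words, roughly 2.5 exabytes a day, or a little under 29 terabytes every second.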

The European Commission believes that ‘data is the new gold’. To boost the economy they have created the Open Data Initiative that aims at opening up Public Sector Information.  As they put it:

Public sector information (PSI) is the single largest source of information in Europe. It is produced and collected by public bodies and includes digital maps, meteorological, legal, traffic, financial, economic and other data. Most of this raw data could be re-used or integrated into new products and services, which we use on a daily basis, such as car navigation systems, weather forecasts, financial and insurance services.

Re-use of public sector information means using it in new ways by adding value to it, combining information from different sources, making mash-ups and new applications, both for commercial and non-commercial purposes. Public sector information has great economic potential. [..] Increase in the re-use of PSI generates new businesses and jobs and provides consumers with more choice and more value for money.

And they are not the only ones: the UN also has its own open data initiative. So it’s time to let your imagination fly and ask yourself what information could help your business, however unimaginable it would have been to have access to it before. Managers can now make decisions based on real data analysis.  There are many sectors where you can generate financial value from Big Data; among them, the McKinsey Global Institute points out health care, public sector administration, global personal location data, retail and manufacturing.

From the technological perspective, exploiting Big Data is a great challenge. All these data come from different sources and are stored in different locations and in different formats, so navigating through them is not an easy task. Up to now, companies used their own stored data to do business. They defined the format and created the metadata (information on how to interpret each piece of content, what each bit of information means), which were used consistently throughout the company. For this kind of data (called ‘structured data’) there are a number of proven techniques for manipulating the data, usually stored in ‘databases’ or ‘data warehouses’, and giving answers to business management.

But when it comes to unstructured data, it’s really another business. Not only is there the challenge we mentioned earlier of navigating through data from different locations and converting from one format to another, but there is also the huge volume of data to deal with: think of the quantity of bytes that have to be analysed! Also, to be worth the effort, it has to be done on time, that is, giving the answer to a question while it still matters (in some cases that can be days or hours; in others, like a car guidance program, it is measured in seconds). This is really hard, and classical programs are not up to the challenge. New algorithms are being created and different initiatives are under construction, all fighting to gain momentum and become standards. For me, this trend is worth following.  If you are interested, check Roberto Zicari’s presentation, from ODBMS.org

Crowdsourcing is a flourishing market

CrowdFlower Reports Revenue of 300% Year Over Year and More Than 300 Million Enterprise Level Crowdsourced Microtasks Completed, Earning #1 Rank in Industry

CrowdFlower is a microtasking crowdsourcing enterprise.  The company solves information-based problems like product categorization, SEO content creation, web verifications, etc. by splitting the task into small pieces (micro-tasks) and giving them to its workers (its on-demand contributors world-wide).  It takes care of quality issues and aggregates the results to answer its client’s request.  It has completed 300 million tasks for customer companies including eBay, Microsoft and Twitter.  The Daily Crowdsource has recently published the CrowdCensus report, rating CrowdFlower #1 among industry leaders in micro-tasking.

– In 2011, millions of tasks were performed for virtual goods in Facebook games, with CrowdFlower contributors performing real work (such as sentiment analysis and categorization) for virtual goods or other rewards.

– CrowdFlower has a workforce of more than 2 million individual contributors producing approximately 4 man-years of work daily. In other words, it would take one person four years of work to complete what CrowdFlower’s virtual workforce does in a single day.

[…]  A variety of factors contribute to their success. “Their platform sits on a robust system that does not ignore security, quality, or scalability,” said the Daily Crowdsource report. “CrowdFlower balances the combination of cost and quality through the use of Gold Standard Data, redundancy, and peer review.” [said Woody Hobbs, the CEO from Crowdflower]
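To illustrate what ‘Gold Standard Data’ and ‘redundancy’ could look like in practice, here is a minimal sketch.  It is my own toy version, not CrowdFlower’s actual pipeline: the workers, tasks, answers and the 0.7 accuracy threshold are all made up.  Gold questions are micro-tasks with a known answer, used to decide which workers to trust; redundancy means each real task goes to several workers and the majority answer wins.

```python
from collections import Counter, defaultdict

# Hypothetical gold questions: micro-tasks whose correct answer is known in advance.
GOLD = {"g1": "positive", "g2": "negative"}

# Hypothetical judgments: (worker, task, answer). Each task goes to several workers.
JUDGMENTS = [
    ("w1", "g1", "positive"), ("w1", "g2", "negative"),
    ("w2", "g1", "positive"), ("w2", "g2", "negative"),
    ("w3", "g1", "negative"), ("w3", "g2", "positive"),  # w3 fails both gold questions
    ("w1", "t1", "positive"), ("w2", "t1", "positive"), ("w3", "t1", "negative"),
    ("w1", "t2", "negative"), ("w2", "t2", "negative"), ("w3", "t2", "positive"),
]

def trusted_workers(judgments, gold, min_accuracy=0.7):
    """Keep the workers whose accuracy on the gold questions is high enough."""
    hits, seen = Counter(), Counter()
    for worker, task, answer in judgments:
        if task in gold:
            seen[worker] += 1
            hits[worker] += answer == gold[task]
    return {w for w in seen if hits[w] / seen[w] >= min_accuracy}

def aggregate(judgments, gold, min_accuracy=0.7):
    """Majority vote per real task, counting only trusted workers."""
    trusted = trusted_workers(judgments, gold, min_accuracy)
    votes = defaultdict(Counter)
    for worker, task, answer in judgments:
        if worker in trusted and task not in gold:
            votes[task][answer] += 1
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}

results = aggregate(JUDGMENTS, GOLD)
print(results)
```

Here worker w3 is dropped because of the gold questions, so the two remaining workers decide each task.  A real system would also need a rule for ties, which this sketch leaves to chance.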

CrowdFlower is not an isolated case: the microtasking industry has done very well in recent years and is expanding rapidly world-wide.

 

Microtasks market grew almost 4 times in 2011

Check this article by David Bratvold, ‘Enterprise Crowdsourcing blasts off as social media growth industry’.

His research foresees a growth of the crowdsourcing microtask sector of around 355% this year.  I am interested in your opinion: are you hearing about crowdsourcing in the enterprises around you? Do you think it’s good or bad for our world, given the economic crisis in which we are immersed?

As buzzwords go, “crowdsourcing” may not be as big as ”social-media” or “mobile apps” but new research show it is one of the most rapidly-expanding trends in our field. Crowdsourcing represents an epic shift in the world of labor, automation, and information science, one with large economic and ethical implications.

[…] To answer these questions accurately, we took the last three months to perform a thorough analysis of enterprise-grade microtasking vendors and produced a market report.  We chose the ‘microtasking’ sector to start with because it’s one of the two sectors that enterprises can benefit from the most. Here’s what we found: There are currently six enterprise-grade microtasking providers: Clickworker, CrowdFlower, CrowdSource, Microtask, Microworkers, & Serv.io (aka CloudCrowd).
[…] The market demand for crowd-sourced work quintupled in 2010 & almost quadrupled in 2011:

Despite being around for six years, the microtasking field was in the testing phase for the first three years.  Several platforms were revamped, relaunched, or finally “released” in 2009.  Client adoption was also slow until 2009 when the first surge in market demand occurred.  Last year, the number of completed microtasks increased 496% over 2009.  The number of tasks completed in 2011 is estimated to increase 355%
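A small note on reading those percentages, since ‘increased 496%’ can be understood in two ways.  The sketch below just does the arithmetic for both readings; the function names are mine, and the report itself does not say which reading is meant.

```python
def multiplier_if_increase(percent):
    """'Increased BY percent%': new = old * (1 + percent / 100)."""
    return 1 + percent / 100

def multiplier_if_ratio(percent):
    """'Grew TO percent% of the old value': new = old * percent / 100."""
    return percent / 100

# Percentages quoted in the article for 2010 and 2011.
for year, percent in [(2010, 496), (2011, 355)]:
    print(year,
          "as increase:", round(multiplier_if_increase(percent), 2),
          "as ratio:", round(multiplier_if_ratio(percent), 2))
```

Read as ratios, the figures give about 5 times and about 3.6 times, which seems closer to the ‘quintupled’ and ‘almost quadrupled’ wording quoted above; read as increases, they would mean roughly 6 times and 4.6 times.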

TEDxBrussels, continued…

Julie Meyer, founder and CEO of Ariadne Capital, was also very interesting, calling the economy of our 21st century ‘Individualist Capitalism’.  Her very clever message to all entrepreneurs was: look at your ‘Natural Allies’, and ask yourself who has an interest in your success.  They are going to be there and help you to be successful!

On privacy and the Internet’s inability to forget, Mikko Hyppönen left me thinking with his question: ‘Do you trust your future governments with what you said today?’

Alain De Taeye, co-founder of Tele Atlas, gave a very realistic presentation.  He talked about TomTom, and the fact that they are predicting their users’ immediate future by letting them know what’s ahead of them (traffic jams, accidents), so that the driver can decide to take another route and avoid that near future.  Who can say for sure that being informed of the possibilities on the path in front of us is not, as he said, predicting our future?  It is like when you learn a magician’s trick: it’s no longer magic.  Will we have the same feeling when we are able to know our future?  Will it seem so obvious to us?

That’s all for today, folks!

TEDxBrussels in XL format

Yesterday was the big day. Eleven hours of TED, full of talks of around 8 minutes each: what a challenge for the speakers, and for the audience!  It was tiring, but worth it.  What did I like the most?  By far, and in a different register from the other speakers, Paddy Ashdown: what a clear picture of globalisation, the forces at play and the need for governance.  It was so good that I’m happy we will have it on video, to hear it again and pass it on to my friends.

Now I hope you will excuse me if my AI background makes me go into more detail on the talks about Robotics 🙂  I was impressed by the Geminoid.  They had some little problems with the demo, but the resemblance and his facial expressions seemed very real.  How human must the Geminoid look that in my sentence I said ‘his face’ and not ‘its face’ 🙂  I thought about it, but it didn’t feel wrong.  In the same category, Luc Steels presented an altogether different aspect of robots: not the human look-alike, but the learning behaviour.  They initialise a ‘mental state’ for robots that can be downloaded onto Sony robotic hardware.  Each mental state evolves through interaction with another robot, trying to communicate, creating and learning words that represent the objects of their world. I see the robots going through our evolutionary steps, at a drastically different pace from ours.  It is not ‘if’ anymore but ‘when’: when will be the moment we consider them sentient? How will society react to that?

I don’t want to end without mentioning it.  We even had a ballet of electrons on stage.  Grandiose!  Physics explained through dance. Matter, laser movements, quantum mechanics, the flow… I really encourage all science teachers to use this video in their class, to explain those concepts to the kids.  So easy to understand, so visual!

For the other talks, which were very interesting too, you will have to come back; that’s all for this post.
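For the curious, the idea of agents inventing and learning words from each other can be sketched in a few lines, in the spirit of the ‘naming game’ from this line of research.  This is only my toy version, not the software shown on stage: there is a single object to name, and the agents, the word format and the number of rounds are all assumptions.

```python
import itertools
import random

def play_round(speaker, hearer, rng, new_word):
    """One interaction about a single shared object.
    speaker and hearer are sets of candidate words; returns True on success."""
    if not speaker:
        speaker.add(new_word())        # nothing to say yet: invent a word
    word = rng.choice(sorted(speaker))
    if word in hearer:
        # Success: both agents keep only the word that worked.
        speaker.clear()
        speaker.add(word)
        hearer.clear()
        hearer.add(word)
        return True
    hearer.add(word)                   # Failure: the hearer learns the new word.
    return False

def simulate(n_agents=10, rounds=2000, seed=0):
    """Let randomly chosen pairs of agents talk to each other for a while."""
    rng = random.Random(seed)
    counter = itertools.count()

    def new_word():
        return f"word{next(counter)}"

    agents = [set() for _ in range(n_agents)]
    for _ in range(rounds):
        s, h = rng.sample(range(n_agents), 2)
        play_round(agents[s], agents[h], rng, new_word)
    return agents

agents = simulate()
print("distinct vocabularies:", len({frozenset(a) for a in agents}))
```

Each agent starts with no words at all.  Over many rounds the competing words tend to die out until the group shares one name for the object, although how fast that happens depends on the number of agents and on chance.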