Bits Of Knowledge

A Blog about Machine Learning, Data Privacy and what it takes to make sense of the digital words in the rise of the digital millennium.

Nine Key Questions To Evaluate A New Technology

Written By: Corina Ciechanow - May• 31•13

Technical and scientific knowledge is growing fast. This allows for a lot of experimentation and innovation, but to take part you first need to acquire very specific knowledge and educate yourself in many technical disciplines.

This is not within everybody’s reach. In particular, it is hardly within reach of non-technical decision-makers such as business managers or politicians. How can we expect politicians and the general public to take an ethical or societal position, or to legislate, on issues for which they have neither the background knowledge nor the time (or will) to learn?

Nevertheless, some of these new inventions need legislation, and as a business decision-maker you definitely need to be aware of them. Here’s a practical approach to tackle this issue: use critical thinking, relying on contextual knowledge instead of technical knowledge.

I’m extrapolating here from the 9 questions Miguel Aznar used to tackle nanotechnology, in order to get a contextual understanding of any technical issue:

  1. What is this new technique?
  2. Why do we use it?
  3. Where does it come from?
  4. How does it work? (Just roughly: you don’t have to get too technical, use analogies.)
  5. How is it evolving?
  6. How is this technology changing us (as an individual, as a society)?
  7. How could we change/adapt it?
  8. What are the pros and cons?
  9. How to evaluate/measure it?

This is a good set of questions to remember before taking action, don’t you agree?

photo by: x-ray delta one

Is your Robot feeling lonely? Connect it to RoboEarth

Written By: Corina Ciechanow - Apr• 17•13

It (or he/she?) doesn’t need to: there is now a platform to connect to others. I wouldn’t call it the Facebook for Robots, it’s more like a giant Academia :-) but RoboEarth enables robots to share their experiences and what they have learned. It is a Cloud environment that also lets them use external storage and computation capabilities, which frees them from physically carrying the extra kilos of storage or processors needed to execute their tasks.
See the official definition:

What is RoboEarth?

At its core, RoboEarth is a World Wide Web for robots: a giant network and database repository where robots can share information and learn from each other about their behavior and their environment. Bringing a new meaning to the phrase “experience is the best teacher”, the goal of RoboEarth is to allow robotic systems to benefit from the experience of other robots, paving the way for rapid advances in machine cognition and behaviour, and ultimately, for more subtle and sophisticated human-machine interaction.

RoboEarth offers a Cloud Robotics infrastructure, which includes everything needed to close the loop from robot to the cloud and back to the robot. RoboEarth’s World-Wide-Web style database stores knowledge generated by humans – and robots – in a machine-readable format. Data stored in the RoboEarth knowledge base include software components, maps for navigation (e.g., object locations, world models), task knowledge (e.g., action recipes, manipulation strategies), and object recognition models (e.g., images, object models).

I think this platform will bring an exponential leap in robots’ capabilities. It is sometimes hard for humans to learn by example, but it is not so for robots.

And isn’t this like crowdsourcing between robots?

Internet of Things and the Power of Feedback Loops

Written By: Corina Ciechanow - Apr• 01•13

This 2011 Wired post by Thomas Goetz on feedback loops explains how we can change (or improve) our behaviour just by measuring it. I would add another factor that I think is just as important in making us change: putting our behaviour on display. For example, if you have ever been on a diet, you know how valuable it is to let people know about it: social pressure helps you keep on track and reach your goal. Thomas Goetz gives the example of showing your speed:

The potential of the feedback loop to affect behavior was explored in the 1960s, most notably in the work of Albert Bandura, a Stanford University psychologist and pioneer in the study of behavior change and motivation. Drawing on several education experiments involving children, Bandura observed that giving individuals a clear goal and a means to evaluate their progress toward that goal greatly increased the likelihood that they would achieve it. He later expanded this notion into the concept of self-efficacy, which holds that the more we believe we can meet a goal, the more likely we will do so. In the 40 years since Bandura’s early work, feedback loops have been thoroughly researched and validated in psychology, epidemiology, military strategy, environmental studies, engineering, and economics. (In typical academic fashion, each discipline tends to reinvent the methodology and rephrase the terminology, but the basic framework remains the same.) Feedback loops are a common tool in athletic training plans, executive coaching strategies, and a multitude of other self-improvement programs (though some are more true to the science than others).

Despite the volume of research and a proven capacity to affect human behavior, we don’t often use feedback loops in everyday life. Blame this on two factors: Until now, the necessary catalyst—personalized data—has been an expensive commodity. Health spas, athletic training centers, and self-improvement workshops all traffic in fastidiously culled data at premium rates. Outside of those rare realms, the cornerstone information has been just too expensive to come by. As a technologist might put it, personalized data hasn’t really scaled.

Second, collecting data on the cheap is cumbersome. Although the basic idea of self-tracking has been available to anyone willing to put in the effort, few people stick with the routine of toting around a notebook, writing down every Hostess cupcake they consume or every flight of stairs they climb. It’s just too much bother. The technologist would say that capturing that data involves too much friction. As a result, feedback loops are niche tools, for the most part, rewarding for those with the money, willpower, or geeky inclination to obsessively track their own behavior, but impractical for the rest of us.

Remember that this was written two years ago. At the pace of technological advances, the limitations he saw in collecting and storing personal data are not so relevant anymore.

For you, entrepreneur reader: check the article for the examples it provides, and see below for some general ideas that can make use of this loop model:

And today, their promise couldn’t be greater. The intransigence of human behavior has emerged as the root of most of the world’s biggest challenges. Witness the rise in obesity, the persistence of smoking, the soaring number of people who have one or more chronic diseases. Consider our problems with carbon emissions, where managing personal energy consumption could be the difference between a climate under control and one beyond help. And feedback loops aren’t just about solving problems. They could create opportunities. Feedback loops can improve how companies motivate and empower their employees, allowing workers to monitor their own productivity and set their own schedules. They could lead to lower consumption of precious resources and more productive use of what we do consume. They could allow people to set and achieve better-defined, more ambitious goals and curb destructive behaviors, replacing them with positive actions. Used in organizations or communities, they can help groups work together to take on more daunting challenges. In short, the feedback loop is an age-old strategy revitalized by state-of-the-art technology. As such, it is perhaps the most promising tool for behavioral change to have come along in decades.

I like his way of turning the expression around:

[...] But as GreenGoose, Belkin, and other companies begin to use sensors to deploy feedback loops throughout our lives, we can finally see the potential of a sensor-rich environment. The Internet of Things isn’t about the things; it’s about us.

Small talk on Big Data

Written By: Corina Ciechanow - Mar• 28•13

Last week I presented this topic to professional women at PWI here in Brussels. It’s called ‘small talk’ because it is not a technical presentation but one for a broader audience, to create awareness of the Big Data trend. The main concept I wanted them to take away is the change in the business arena and in our society due to Big Data. If you are interested in this subject, just drop me a line and let me know!

Prices of disks and storage devices have dropped a lot, so now basically any digital data is being stored. The cost is so low that it is worth saving data ‘just in case, and we’ll see in the future what we can do with this data’. Technology has also made huge advances in massively parallel processing, and we can now juggle thousands of servers to analyse a mass of diverse data and extract information from it in a usable time-frame.

This allows business strategists to make smarter decisions based on facts, rather than on experience or intuition as was done before. So the message for all decision-makers is: go and check your data, you’ll find valuable information there to help decide any business matter. Also, be aware that your competition is doing it too, and it can outsmart you!

At the society level, there are many ethical issues to deal with, like privacy or equality and fairness. What do you think: is it fair to have a ‘personalised’ subsidy that may give more to one person than to others because of a particular factor, or to allow one person access to a health treatment and not another based on life expectancy, for example? What about basing the decision on a person’s ‘ROI’, such as the capability of paying back for the given treatment? Or is it fairer to have equal subsidies instead, the same amount for everyone? Even for those who could pay for it themselves? Either we discuss these questions beforehand, or we will be at the mercy of any politician or entrepreneur taking a step further in an unethical direction.

And as a last twist, I would like to point out that the basic value of knowledge is being challenged. We are already experiencing a change of values: knowledge itself is less and less valued as an asset, while value remains in knowing how to get to the knowledge, where to find it and what to extract from data.

Women In Tech : There is Hope says Vivek Wadhwa

Written By: Corina Ciechanow - Mar• 27•13

I loved this article by Vivek Wadhwa, from Stanford University, about (the lack of) women in technology. He writes that Silicon Valley seemed to him a meritocracy, as a lot of nationalities were represented, but then his wife made him see the missing element… women! You can argue that for real diversity we should also look at the representation of other minorities, but women are not a minority, we are basically half of the population! Thinking again about the article, more than Vivek, I love his wife :-)

Now the good news of his investigation:

This raised the question: are women less competent as entrepreneurs than men are? Are they not cut out for the rough-and-tumble world of entrepreneurship? The answer turned out to be none of this. An analysis performed by the Kauffman Foundation showed that women are more capital-efficient than men. Babson’s Global Entrepreneurship Monitor found that women-led high-tech startups have lower failure rates than those led by men. Other research has shown that venture-backed companies run by women have annual revenues 12 percent higher than those by men and that organizations that are the most inclusive of women in top management positions achieve a 35% higher return on equity and 34% higher total return to shareholders.

So men, find yourself a woman partner for your next business: you’ll have a competitive advantage from the start ;-)

Railways Powered by Inspire

Written By: Corina Ciechanow - Mar• 05•13

I attended the Powered by Inspire conference here in Brussels. It was all about European geo-spatial standardization.

Erika Nissi, from the International Union of Railways, spoke about their particular situation regarding this European Directive. They want to move in that direction, and they will comply eventually, but railways have a lot of other Directives and Regulations with which they must comply too, so they will be focusing primarily on their business needs.

In particular, by the end of 2015 they have to create a full European dataset with the railway infrastructure information from each country, and geo-spatial location is only part of it. Like other economic sectors, they face problems of data quality and granularity (for example, the junction information foreseen in Inspire is not detailed enough for their operational needs) and, on top of that, a lack of means because of the economic crisis.

In the future they will contribute a unified technical infrastructure dataset, but also consume much of the other available cross-border information, like land leveling, city layouts or vegetation information when the train crosses a forest. Knowing urbanization plans would help them plan new lines; traffic density would help them adjust train capacities; weather forecasts would improve estimated schedules, improving customer satisfaction. They could also evaluate the economic impact of new TGV lines…

I’ll just stop here. If you can imagine more possibilities, please leave them in a comment :-)

The Bad Side of Personalisation

Written By: Corina Ciechanow - Jan• 30•13

As I mentioned in my previous post, there is a vast amount of data available on the Internet, a lot of potential information. The insights we can get from it are fantastic, not only for our businesses but also for us as consumers, as users of the Internet.

Lately, when you look for something on Google, as it is getting ‘wiser’ and better at guessing what you are after, you are presented with the right information more quickly, and this is super, isn’t it? Also, if you have a new hobby and you are being targeted by ads related to it, well… even though it may not be good for your wallet, I’m sure you will enjoy your newly bought gadgets. All applications are trying to ‘personalise’ their interactions, to be more specific in order to get your attention.

As always, good things can have a downside. When you search on a subject and the results shown to you are the ones that best match your interests, you are cut off from other, more diverse information on the subject. Yes, you may say these ‘other’ results are surely there if you scroll for them… on the third page maybe? But who even looks at the second page of results nowadays?

So, on the one hand, results potentially match what we are looking for, but on the other hand, we are not being shown the ‘other side’ of the world. We are not being presented with other points of view. And this is critical. I loved this TED talk by Jonathan Haidt about The moral roots of liberals and conservatives, which makes a clear case for this point. Diversity is a good thing and has to be promoted, not made harder to access.

Knowing this, the best we can do about it, for the sake of humanity, is to draw the attention of app designers so that they think of a way to balance their algorithms to avoid this perverse effect of personalization. Spread the word!
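
One way app designers could approach this balancing, sketched here purely as an illustration: reserve every third slot of a results page for the best-ranked item that falls outside the user’s usual interests. The `diversify` function, the `(title, topic)` result records and the `every` parameter are all hypothetical names invented for this sketch; real search engines rank results in far more sophisticated ways.

```python
from typing import List, Set, Tuple

Result = Tuple[str, str]  # (title, topic): a hypothetical result record


def diversify(ranked: List[Result], user_topics: Set[str], every: int = 3) -> List[Result]:
    """Re-rank results so that every `every`-th slot shows something
    outside the user's usual topics, when such a result exists."""
    familiar = [r for r in ranked if r[1] in user_topics]
    unfamiliar = [r for r in ranked if r[1] not in user_topics]
    mixed: List[Result] = []
    while familiar or unfamiliar:
        slot = len(mixed) + 1
        wants_unfamiliar = slot % every == 0
        if (wants_unfamiliar and unfamiliar) or not familiar:
            mixed.append(unfamiliar.pop(0))
        else:
            mixed.append(familiar.pop(0))
    return mixed


# Personalised ranking: the first four results match the user's interests.
ranked = [
    ("Post A", "gadgets"), ("Post B", "gadgets"), ("Post C", "gadgets"),
    ("Post D", "gadgets"), ("Post E", "gardening"), ("Post F", "politics"),
]
page = diversify(ranked, user_topics={"gadgets"})
print([title for title, _ in page])
```

In this toy example the gardening post moves up from fifth to third place, so the ‘other side’ appears on the first screen instead of being buried. The order within each group is untouched, which keeps the personalised ranking mostly intact.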

Big Data, a trend to follow for business innovation

Written By: Corina Ciechanow - Dec• 23•12

Every day more than 2.5 quintillion (2.5 × 10^18) bytes of data are created, coming from business and bank transactions, posts on social media sites, digital photos, videos, sensors such as GPS receivers, and more. Big Data is the name given to the mass of unstructured data available nowadays on the Internet.
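
To get a feel for that figure, here is a quick back-of-the-envelope conversion. It assumes decimal units (1 exabyte = 10^18 bytes, 1 terabyte = 10^12 bytes) and a data flow spread evenly over the day, which is of course a simplification.

```python
BYTES_PER_DAY = 2.5e18          # the figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 1e12

print(f"{exabytes_per_day:.1f} exabytes per day")
print(f"about {terabytes_per_second:.1f} terabytes per second")
```

In other words, roughly 29 terabytes of new data every second, around the clock.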

All this data is a big resource, and many of these data sets are available to everybody. Some companies are exploiting it already: you may have guessed that Google looks at the subjects you are interested in and presents you with ads related to that content. Facebook, for example, looks at the friends of your friends to suggest new contacts. Other examples are less obvious but full of good business sense, like an airline improving the pilot’s ETA for a flight using, among other things, weather and air traffic information. The new ETA is more accurate and makes it possible to reduce idle time at airports. The McKinsey Global Institute calls Big Data ‘the next frontier for innovation, competition and productivity’.

The European Commission believes that ‘data is the new gold’. To boost the economy it has created the Open Data Initiative, which aims at opening up Public Sector Information. As they put it:

Public sector information (PSI) is the single largest source of information in Europe. It is produced and collected by public bodies and includes digital maps, meteorological, legal, traffic, financial, economic and other data. Most of this raw data could be re-used or integrated into new products and services, which we use on a daily basis, such as car navigation systems, weather forecasts, financial and insurance services.

Re-use of public sector information means using it in new ways by adding value to it, combining information from different sources, making mash-ups and new applications, both for commercial and non-commercial purposes. Public sector information has great economic potential. [..] Increase in the re-use of PSI generates new businesses and jobs and provides consumers with more choice and more value for money.

And they are not the only ones: the UN also has its own open data initiative. So it’s time to let your imagination fly and ask yourself what information could help your business, however unimaginable it was to count on it before. Managers can now make decisions based on real data analysis. There are many sectors where you can generate financial value from Big Data; among them, the McKinsey Global Institute points out health care, public sector administration, global personal location data, retail and manufacturing.

From the technological perspective, exploiting Big Data is a great challenge. All this data comes from different sources and is stored in different locations and in different formats, so navigating through it is not an easy task. Up to now, companies used their own stored data to do their business. They defined the format and created the metadata (information on how to interpret the content, what each bit of information means), which was used consistently throughout the company. For this kind of data (called ‘structured data’) there are a number of proven techniques for manipulating the data, usually stored in ‘databases’ or ‘data warehouses’, and giving answers to business management.
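
As a small illustration of the difference, here is a sketch in which the same sales facts are stored once as structured data and once as free text. The table, the column names and the sample messages are invented for the example. With a schema, a standard query answers the question directly; without one, we have to guess at patterns in the text, and the guess only works for as long as the text happens to follow them.

```python
import re
import sqlite3

# Structured: the schema (the metadata) says what every value means.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)
structured_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = ?", ("north",)
).fetchone()[0]

# Unstructured: the same facts buried in free text, with no schema.
messages = [
    "Sold 120 euros worth in the North today!",
    "south shop: 80 euros",
    "Another 30 euros up north.",
]
amount_pattern = re.compile(r"(\d+)\s*euros")  # a fragile, hand-made rule
text_total = 0.0
for message in messages:
    match = amount_pattern.search(message)
    if match and "north" in message.lower():
        text_total += float(match.group(1))

print(structured_total, text_total)
```

Both totals come out the same here, but only because the three messages were written to fit the hand-made rule. A message such as ‘thirty euros in the northern shop’ would already slip through, which is why unstructured data calls for different tools.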

But when it comes to unstructured data, it’s really another business. Not only is there the challenge, mentioned earlier, of navigating through data from different locations and converting from one format to another, but there is also the huge volume of data to deal with: think of the quantity of bytes that have to be analysed! Also, to be worth the effort, it has to be done on time. That means giving the answer to a question while it still matters (in some cases this can be days or hours; in others, like a car guidance program, it is measured in seconds). This is really hard, and classical programs are not up to the challenge. New algorithms are being created and different initiatives are under construction, all fighting to gain momentum and become standards. For me, this trend is worth following.  If you are interested, check Roberto Zicari’s presentation, from ODBMS.org

Visualising Big Data

Written By: Corina Ciechanow - Nov• 25•12

With all the electronic data being generated, there is a new way of studying sociology. We can now measure what’s happening in real time during a particular event. It means a lot of computation, and new techniques for navigating extremely large data sets have to be used, but there is also the challenge of how to present the results of all these analyses. A 150-page report full of numbers is not easily presented to the general public.

Fortunately, there are now ways of presenting results beyond traditional diagrams: tools that allow complex statistics or concepts to be visualised. Look at the interactive graphic made by JESS3 on the article from The New York Times: Four Ways to Slice Obama’s 2013 Budget Proposal

Visualisation is becoming more and more important for making people understand data at this scale. The proliferation of content makes it difficult to make sense of it; visualisation puts it in a digestible form. JESS3 transformed a 150-page economic report into a 6-minute animated presentation. It was shown at a forum as a video, and the presenters could see from the posture of the audience that they were captivated as they followed it.

Check out this interview with Leslie Bradshaw, co-founder of JESS3, in the Google Developers Live series: GDL Presents: Women Techmakers with JESS3

CNIL Conclusions on Google’s privacy policy

Written By: Corina Ciechanow - Oct• 20•12

Just a few days ago, the European data protection authorities published their conclusions on Google’s compliance with the European Directive on Privacy. Basically, they indicated that Google failed to respect essential principles of the Directive, such as limiting the usage of personal data, minimising the personal data requested, and the right to object.

  • There is not enough information on the nature and usage of the collected data,
  • The way users can control their level of privacy is too complicated,
  • The data collected is not minimized for the purpose,
  • The retention periods are not specified.

As the CNIL  puts it:

..it is not possible to ascertain from the analysis that Google respects the key data protection principles of purpose limitation, data quality, data minimization, proportionality and right to object.[...]

Under the current Policy, a Google service’s user is unable to determine which categories of personal data are processed for this service, and the exact purposes for which these data are processed.

E.g.: the Privacy Policy makes no difference in terms of processing between the innocuous content of search query and the credit card number or the telephone communications of the user ; all these data can be used equally for all the purposes in the Policy.

Moreover, passive users (i.e. those that interact with some of Google’s services like advertising or ‘+1′ buttons on third-party websites) have no information at all.

On the combination of data across services, the change Google has just made, the CNIL says:

Combination of data across services has been generalized with the new Privacy Policy: in practice, any online activity related to Google (use of its services, of its system Android or consultation of third-party websites using Google’s services) can be gathered and combined.

The European DPAs note that this combination pursues different purposes such as the provision of a service requested by the user, product development, security, advertising, the creation of the Google account or academic research. The investigation also showed that the combination of data is extremely broad in terms of scope and age of the data.

E.g.: the mere consultation of a website including a ‘+1′ button is recorded and kept during at least 18 months and can be associated with the uses of Google’s services; data collected with the DoubleClick cookie are associated to a identifying number valid during 2 years and renewable

Here are the recommendations they made to Google to tackle the combination of data across services:

  • reinforce users’ consent to the combination of data for the purposes of service improvements, development of new services, advertising and analytics. This could be realized by giving users the opportunity to choose when their data are combined, for instance with dedicated buttons in the services’ (cf. button “Search Plus Your World”),
  • offer an improved control over the combination of data by simplifying and centralizing the right to object (opt-out) and by allowing users to choose for which service their data are combined
  • adapt the tools used by Google for the combination of data so that it remains limited to the authorized purposes, e.g. by differentiating the tools used for security and those used for advertising.

But there is good news for us citizens in all this:

This letter [a letter to Google with the recommendations of the EU Data protection authorities] is individually signed by 27 European Data protection authorities for the first time and it is a significant step forward in the mobilization of European authorities.

Let’s hope Google’s next Data Privacy Policy will be here soon.