Book summary: AI Superpowers: China, Silicon Valley, and the New World Order by Kai-Fu Lee

The different waves of AI

  • Internet AI – Facebook, Netflix, Google search
  • Business AI – Palantir
  • Perception AI – Tesla cars
  • Autonomous AI – Tesla self driving, Google self driving

Key locations

  • Silicon Valley
  • Zhong Guan Cun – Beijing

State of the Union

  • We are in the stage of implementation/application as opposed to R&D
    • having access to more data is more important than having the expertise to do more R&D
    • having solid AI engineers is more important than having AI researchers
  • We are still far from general AI
  • Key ingredients
    • data
    • computing
    • possibly the work of strong AI algorithm engineers

Key differences between eco-systems

  • Silicon Valley businesses are mission and core values driven while Chinese businesses are pragmatically focused on profitability.
  • Silicon Valley businesses stay in bits and binaries, offloading the brick and mortar to external vendors, while Chinese businesses extend their business model into the brick and mortar (online to offline)
  • Silicon Valley prefers a one-size-fits-all strategy, while Chinese businesses use localized solutions, often investing in or acquiring local startups
  • Americans treat search engines like the Yellow Pages (come and leave fast) while Chinese users treat search engines like a shopping mall (come and linger)
  • Silicon Valley is averse to copying, preferring to be unique, while Chinese businesses copy the heck out of each other

Chinese Advantage

  • Abundant data – quality and quantity aided by their online to offline initiatives
  • hungry entrepreneurs
  • AI scientists
  • AI friendly policy environment – strong emphasis by Chinese government
  • Hardware manufacturing know-how – Shenzhen
    • unparalleled supply chain flexibility – Xiaomi

Silicon Valley Advantage

  • Microchip manufacturing know-how

Trends within the Chinese eco-system

  • A Darwinian eco-system has led to extreme levels of competition
  • Chinese companies have already moved past the stage of cloning Silicon Valley business models
  • Businesses innovate to build a defensive moat around themselves. Local businesses have an advantage: with no timezone differences to deal with, decision making is relatively faster.
  • Online to offline
    • an essential ingredient to building strategic moats
    • caused the decline of cash use
  • Chinese government information systems will be able to leapfrog US government information systems

Policy approaches

  • Google – impeccable safety
  • Tesla / China – trial by fire
  • key to winning the Autonomous AI race
    • is the bottleneck technology (Silicon Valley) or policy (China)?

Key concerns

  • Cheap labor will no longer be a source of advantage in a world heavily powered by automation. Developing countries hoping to use this well-tested strategy to progress will no longer be able to do so
  • Estimated 60% potential job loss worldwide barring policy interventions
  • Job loss probability assessment
    • physical labor
      • environment – unstructured versus structured
      • nature of tasks – low dexterity versus high dexterity
    • cognitive labor
      • social – high versus low
      • cognitive – optimization based versus creativity/strategy based
  • AI replacement approach
    • single tasks approach
    • ground-up re-imagination
  • A population that is irrelevant (no longer employable) as opposed to merely unemployed

Tackling Key concerns

  • Silicon Valley – reduce, retrain and redistribute
  • Kai Fu Lee – stipends for care, service, education

New promise

  • Humans freed up from repetitive tasks can now focus on becoming more human oriented

Related readings

  • Disruptor, Zhou
  • arXiv.org – an online repository of scientific papers
  • Folding Beijing – Hao JingFang

Mark Zuckerberg chats with Yuval Noah Harari on the Future of AI

Key takeaways

  • Spread of inequality where some countries have the ability to harness AI while others don’t
  • AI based recommendation systems moving from being just an oracle to becoming a sovereign
  • AI as a tool is an amplifier
    • concerns that it will benefit totalitarianism more than democracy leading to totalitarianism becoming a more favorable governance model worldwide
    • surveillance
    • psychological manipulation – the inability to know your true self through your thoughts
    • what happens if morality and expediency diverge when it comes to governance
  • Effectiveness of curbing the negative effects of AI by encoding values within policy frameworks governing these AI based systems
    • Companies based in democratic countries will encode democratic values within their systems, and vice versa for totalitarian countries
  • Personalization versus Fragmentation
    • when everyone in a country chooses their own community, and that community is mainly online, there is no longer a glue holding the local community together
  • Long term versus short term
    • The long term benefits might come sooner than expected when taking a short term trade off

 

Key takeaways from the Blockchain revolution

The Blockchain revolution
  • ensuring the integrity of data exchanged among these billions of devices without the need for a trusted third party
  • allowing people who do not have access to the services of these third parties into the digital economy
  • easier ability to get compensated for your work or ownership of digital property
  • Ronald Coase on types of costs
    • Searching cost
    • Coordination cost
    • Contracting cost
  • Dimensions of search
    • horizontal search – wide search across the web
    • vertical search – within a specific website
    • sequence – blockchain?
  • Innovation typically comes from the edge
    • monopolies have a lot of resources but lack the culture and will to explore, Yochai Benkler
    • This can be attributed to high levels of bureaucracy within the core

Highlights from The Future of Humanity by Michio Kaku

  • Organisms on Earth will eventually meet one of three fates: leave, adapt or die. Earth has already sustained 5 extinction cycles.
  • The threats we face are largely self-inflicted
  • Scientific revolutions come in waves, often stimulated by advances in physics
    • 19th century
      • mechanics and thermodynamics: locomotive and industrial revolution
    • 20th century
      • electricity and magnetism bring forth the electric age
    • The forthcoming wave
      • nano-technology
      • AI, neural networks
      • Quantum computing
      • CRISPR revolution
      • Transhumanism – the need to deal with ethical questions
  • Technological regression occurs when the population becomes complacent
    • Admiral Zheng He's fleet under subsequent rulers
    • the US space program after the Cold War ended
  • Interesting phenomena worth exploring
    • wormholes
    • rogue planets – planets that do not orbit any particular stars
    • caloric restriction and increased life expectancy
    • falling birthrates and education of women
    • uploading and downloading of consciousness (Transcendence and Johnny Mnemonic)
    • achievement of super strength on new planets
    • artificial enhancement of body, seamless interfacing with machines (telekinesis)
    • a big bang that happens over and over again, with a universe that does not grow only in one direction
  • Civilization categorization method 1:
    • energy based
      • Type 1: utilizes all the energy of the sunlight falling on the planet
      • Type 2: utilizes all energy its sun produces
      • Type 3: utilizes energy of an entire galaxy
      • Type 4: utilizes energy beyond the galaxy
    • information consumption based
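The energy-based types above are discrete steps. A commonly cited way to place a civilization between them is Carl Sagan's interpolation, K = (log10(P) - 6) / 10 with P in watts. The sketch below is only an illustration: the function name is made up here, and the power figures are conventional round numbers, not values taken from the book summary.

```python
import math


def kardashev_rating(power_watts):
    """Sagan's interpolation of the Kardashev scale; power is in watts."""
    return (math.log10(power_watts) - 6) / 10


# Conventional round-number power levels (assumed for illustration):
# ~1e16 W for Type 1, ~1e26 W for Type 2, ~1e36 W for Type 3.
for label, watts in (("Type 1", 1e16), ("Type 2", 1e26), ("Type 3", 1e36)):
    print(label, round(kardashev_rating(watts), 2))
```

Each step up the scale corresponds to roughly ten orders of magnitude more power under this formula.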

Navigating the trough of sorrow

While I was reading through most of the success stories that were published on IndieHackers.com, it occurred to me that my project GetData.IO really took longer than most others to gain significant traction, a full 5 years actually.

The beginning

I first stumbled upon this project back in December 2012 when I was trying to solve two other problems of my own.

In my first problem, I was trying to identify the best stocks to buy on the Singapore Stock Exchange. While browsing through the stocks listed on their website, I soon realized that most stock exchanges, as well as other financial websites, gear their data presentation towards quick buy and sell behaviors. If you are looking to get data for granular analysis based on historical company performance as opposed to stock price movements, it's like pulling teeth. Even then, the important financial data I needed for decision making was spread across multiple websites. This first problem led me to write 2 web scrapers, one for SGX.com and the other for Yahoo Finance, to extract data-sets which I later combined to help me with my investment decision-making process.

Once I had happily parked my cash, I went back to working on my side project at the time. It was a travel portal which aggregated all the travel packages from tour agencies located in Southeast Asia. It was not long before I encountered my second problem… I had to write a bunch of web scrapers again to pull data from vendor sites which did not have APIs! Being forced to write my 3rd, 4th and maybe 5th web scraper within a single week led me to put all work on hold and step back to look at the bigger picture.

The insight

Being a web developer, and understanding how other web developers think, I quickly noticed the patterns that repeat themselves across webpage listings as well as nested webpages. This is especially true for naming conventions when it comes to CSS styling. Developers tend to name their CSS classes the way they would name actual physical objects in the world.

I figured that if there existed a Semantic Query Language that was program independent, it would provide the benefit of querying webpages as if they were database tables while providing a clean abstraction of the schema from the underlying technology. These two insights still prove true today, 6 years into the project.

The trough of sorrow

While the first 5 years depicted in the trend line above seem peaceful due to a lack of activity, it felt anything but peaceful. During this time, I was privately struggling with a bunch of challenges.

Team management mistakes and premature scaling

First and foremost was team management. During the inception of the project, my ex-schoolmate from years ago approached me to ask if there was any project that he could get involved in. Since I was working on this project, it was natural that I invited him to join. We soon got ourselves into an incubator in Singapore called JFDI.

In hindsight, while the experience provided us with general knowledge and friends, it really felt like going through a whirlwind. The most important piece of knowledge I came across during the incubation period was this book recommendation: The Founder’s Dilemmas. I wish I had read the book before I made all of the mistakes I did.

There was a lot of hype (see the blip in mid-2013), tension and stress between me and my ex-schoolmate during this period. We went our separate ways shortly after JFDI Demo Day due to differences in vision of how the project should proceed. It was not long before I grew the team to a size of 6 and then disbanded it, realizing it was naive to scale in size before figuring out the monetization model.

Investor management mistakes

During this period of time, I also managed to commit a bunch of grave mistakes which I vow never to repeat again.

Mistake #1 was being too liberal with the stock allocation. When we incorporated the company, I was naive enough to believe the team would stay intact in its then configuration all the way through to the end. The cliff before vesting was to begin was only 3 months, with full vesting occurring in 2 years. When my ex-schoolmate departed, the cap table was in a total mess, with a huge chunk owned by a non-operator and none left for future employees without significant dilution of existing folks. This was the first serious red flag when it came to fund raising.
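To make the arithmetic of that schedule concrete, here is a minimal sketch. It assumes linear monthly vesting with everything accrued up to the cliff released at the cliff, which is a common convention but not something the actual agreement is known to have specified. The function name and the 12-month/48-month comparison schedule are illustrative assumptions.

```python
def vested_fraction(months_elapsed, cliff_months, total_months):
    """Fraction of a grant vested, assuming linear monthly vesting with a cliff."""
    if months_elapsed < cliff_months:
        return 0.0  # nothing vests before the cliff
    return min(months_elapsed, total_months) / total_months


# Schedule described above: 3-month cliff, fully vested after 24 months.
for month in (2, 3, 12, 24):
    print(month, vested_fraction(month, 3, 24))

# Assumed comparison schedule: 12-month cliff, fully vested after 48 months.
print(12, vested_fraction(12, 12, 48))
```

Under these assumptions, someone leaving after 12 months keeps half of their grant on the 3-month/24-month schedule, versus a quarter on the longer one.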

Mistake #2 was giving away too much of the company for too little, too early in the project, before achieving critical milestones. This was the second serious red flag, and it really turned off would-be follow-on investors.

Mistake #3 was not realizing the difference in mindset between investors in Asia and those in Silicon Valley, and thereafter picking the wrong geographical location (a.k.a. network) to incubate the project. Incubating the project in the wrong network can be really detrimental to its future growth. Asian investors are inclined towards investing in applications that have a clear path to monetization, while Silicon Valley investors are open to investing in deep technology for which the path to monetization is not yet apparent. During the subsequent period, I saw two similar projects incubated and successfully launched via Y Combinator.

The way I managed to fix the three problems above was to acquire funds I didn’t yet have by taking up a day job while relocating the project back to the Valley’s network. I count my blessings for having friends who lent a helping hand when I was in a crunch.

Self-doubt

I remember a conversation with the head of the incubator two years into the project, during a visit back to Singapore, when he tried to convince me the project was going nowhere and that I should just throw in the towel. I managed to convince him, and more importantly myself, to give it a go for another 6 months, till the end of the year.

I remember the evenings and weekends alone in my room when I was not working on my day job. In between spurts of coding, I would browse the web or sit staring at the wall, trying to envision what product market fit would look like. As Steve Jobs once mentioned in a lecture, it felt like pushing against a wall with no signs of progress or movement whatsoever. If anything, it was a lot of frustration, self-doubt and dejection. A few times, I felt like throwing in the towel and just giving up. For a period of 6 months in 2014, I actually stopped touching the code in total exasperation and just left the project running on auto-pilot, swearing never to look at it again.

The hiatus was not to last long though. A calling is just like a siren: even if somewhat faint at times, it calls out to you in the depths of night or when you are just strolling along the serene beaches of California. It was not long before I was back on my MacBook, plowing through the project again with renewed vigor.

First signs of life

It was mid-2015, and the project was still not showing signs of any form of traction. I had by then stockpiled some cash from my day job and was starting to get interested in acquiring a piece of real estate, with the hope of generating some cashflow to bootstrap the project while freeing up my own time. It was during this period that I got introduced to my friend’s roommate, who also happened to be interested in real estate.

We started meeting on weekends and using GetData.IO to gather real estate data for our real estate investment purposes. We were gonna perform machine learning for real estate. The scope of the project was really demanding. It was during this period of dogfooding that I started understanding how users would use GetData.IO. It was also then that I realized how shitty and unsuited the infrastructure was for the kind and scale of data harvesting required for projects like ours. It catalyzed a full rewrite of the infrastructure over the course of the next two years and brought the semantic query language to maturity.

Technical challenges

Similar to what Max Levchin mentioned in the book Founders at Work, during this period there was always this fear in the back of my mind that I would encounter technical challenges which would be unsolvable.

The site would occasionally go down as we started scaling the volume of daily crawls. I would spend hours on the weekends digging through the logs, attempting to reproduce the error so as to understand the root cause. The operation was like a (data) pipeline: scaling one section of the pipeline without addressing the sections further down would inevitably cause fissures and breakage. Some form of manual calculus in the head would always need to be performed to figure out the best configuration to balance the volume and the costs.

The number 1 hardest problem I had to tackle during this period was the problem of caching and storage. As the volume of data increased, storage costs increased and so did the wait time required before data could be downloaded. This problem brought down the central database a few times.

After procrastinating for a while as the problem festered, in mid-2016 I decided that it was to be the number 1 priority to be solved. I spent a good 4 months going to big-data and artificial intelligence MeetUps in the Bay Area to check out the types of solutions available for the problem. While no suitable solutions were found, the 4 months helped surface corner cases of the problem which I had not previously thought of. I ended up building my own in-house solution.
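The in-house solution itself is not described here. Purely as an illustration of the general idea behind caching crawl results, the sketch below shows a small time-bounded cache, so that repeated requests for the same data within a window are served from memory instead of hitting the crawler or the central database again. The class name, the TTL value and the fake clock are all assumptions made for the example.

```python
import time


class TTLCache:
    """Keeps computed values for a fixed number of seconds (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the demo below is deterministic
        self._store = {}        # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]       # still fresh: skip the expensive work
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


# Demo with a fake clock and a stand-in for an expensive crawl.
fake_now = [0.0]
crawl_calls = []


def expensive_crawl():
    crawl_calls.append(1)
    return "rows for example.com"


cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
cache.get_or_compute("example.com", expensive_crawl)  # computed
cache.get_or_compute("example.com", expensive_crawl)  # served from cache
fake_now[0] = 61.0
cache.get_or_compute("example.com", expensive_crawl)  # expired, recomputed
print(len(crawl_calls))
```

The trade-off in any scheme like this is freshness against cost: a longer window means fewer crawls and faster downloads, but staler data.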

Traction and Growth

An unforeseen side effect of solving the storage and caching problem was its effect on SEO. The effects on SEO were not visible until mid-2017, when I started seeing an increased volume of organic traffic to the site. As load times were reduced from more than a minute in some cases to less than 400 milliseconds, the volume of pages indexed by bots increased, accompanied by an increase in the volume of visitors and a reduction in bounce rates.

Continued education

It was in early 2016 that I came across an article by Paul Graham expounding the benefits of reading widely and deeply, which prompted me to pick up my hobby of reading again. A self-hack demonstrated to me by the same friend who helped relocate me here to the Bay Area, which I pursued vehemently, got me reading up to 1.5 books a week. These are books which I summarized on my personal blog for later reference. All the learnings developed my mental model of the world and greatly aided the way I tackled the project.

During my time working as a tech lead under his wing, Edmodo’s VP of engineering hammered in the importance of not boiling the ocean when attempting to solve a technical problem, and of always being judicious with the use of resources. Another key lesson learned from him is that in some circumstances being liked and being effective do not go hand in hand. As the key decision maker, it is important to steadfastly practice the discipline of being effective.

Head of Design, Tim and Lukas helped me appreciate the significance of UX during my time working with them and how it ties to user psychology.

Edmodo’s CEO introduced us to mindfulness meditation in late 2016 to help us weather the turbulent times that the company was going through then. It was rough. The practice, which I have kept up to this day, has helped keep my mind balanced while navigating the uncertainties of the path I am treading.

Edmodo’s VP of product sent me for a course in late 2017 which helped consolidate all the knowledge I had acquired till then into a coherent whole. The knowledge gained has greatly helped accelerate the progress of GetData.IO. During the same period, he also introduced me to the Vipassana meditation practice, which, coincidentally, a large percentage of the management team practices.

One very significant paradigm shift I observed in myself during this period of continued education is in the relationship between myself and the project. It has changed from an attitude of urgently needing to succeed at all costs to an attitude of open curiosity and fascination, as one would have towards an open-ended science project.

Moving forward

I have now started working full time on the project again. GetData.IO has the support of more than 1,500 community members worldwide. Our mission is to turn the Web into the fully functional Giant Graph Database of Human Knowledge. Financially, with the help of our community members, the project is now self-sustaining. I feel grateful for all the support and lessons gained during this 6-year journey. I look forward to the journey ahead as I continue along my path.

Reflections on the decentralized multi-sided market place – GetData.IO

The beginning

The concept of GetData.IO was first conceived back in November 2012. I was rewriting one of my side projects (ThingsToDoSingapore.com) in NodeJS back then. Part of the rewrite required that I write two separate crawlers, each for a different site which I was getting data from.

Very soon after I was done with the initial rewrite, I was once again compelled to write a third crawler when I wanted to buy some stocks on the Singapore stock exchange. I realized that while the data for the shares was available on the site, it was not presented in a way that facilitated my decision making process. In addition to that, the other part of the data I needed was presented on a separate site and, unsurprisingly, not in the way I needed.

I was on my way to writing my fourth crawler when it occurred to me that if I structured my code by cleanly decoupling the declaration from the underlying implementation details, it would be possible to achieve a high level of code re-use.

Two weekends of tinkering and frenzied coding later, I was able to complete the first draft of the Semantic Query Language and the engine that would interpret this query language. I was in love. Using just simple JSON, it allowed anybody to declare the desired data from any part of the web. This includes data scattered across multiple pages on the same site or data scattered across multiple domains, which could be joined using unique keywords.
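To give a feel for the idea of declaring data rather than coding a crawler, here is a minimal sketch. It is not the actual GetData.IO query syntax or engine: the spec layout, the names `run_query` and `join_on`, and the sample HTML are all hypothetical. A JSON-style declaration maps output columns to CSS class names, a small generic interpreter turns a page into rows, and rows from two sources are joined on a shared key.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collects the text of elements that carry one of the wanted CSS classes.

    Simplification: assumes the matched elements contain plain text only.
    """

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.found = []        # (css_class, text) pairs in document order
        self._current = None   # CSS class being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        hits = [name for name in classes if name in self.wanted]
        if hits:
            self._current = hits[0]
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            self.found.append((self._current, "".join(self._buffer).strip()))
            self._current = None


def run_query(spec, html):
    """Interprets a declaration of the form {"fields": {column: css_class}}."""
    fields = spec["fields"]
    parser = ClassTextCollector(fields.values())
    parser.feed(html)
    parser.close()
    columns = {col: [text for cls, text in parser.found if cls == css]
               for col, css in fields.items()}
    return [dict(zip(columns, values)) for values in zip(*columns.values())]


def join_on(key, left, right):
    """Joins two row lists on a shared key, like a simple inner join."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Hypothetical pages from two different sites.
PRICES_HTML = ('<ul><li><span class="ticker">ABC</span> <span class="price">1.20</span></li>'
               '<li><span class="ticker">XYZ</span> <span class="price">3.45</span></li></ul>')
EARNINGS_HTML = ('<table><tr><td class="ticker">ABC</td><td class="eps">0.10</td></tr>'
                 '<tr><td class="ticker">XYZ</td><td class="eps">0.25</td></tr></table>')

prices = run_query({"fields": {"ticker": "ticker", "price": "price"}}, PRICES_HTML)
earnings = run_query({"fields": {"ticker": "ticker", "eps": "eps"}}, EARNINGS_HTML)
joined = join_on("ticker", prices, earnings)
print(joined)
```

The point of the design is that supporting a new site means writing a new declaration, not a new crawler; the interpreter stays the same.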

The Journey

Five years have passed since. During this time, I brought this project through an incubator in Singapore with my ex-co-founder, tore out and rewrote major parts of the code-base that did not scale well, banged my head countless times on the wall in frustration due to problems with the code and with product market fit, and watched a bunch of well-funded entrants come and go. To be honest, quite a few times I threw in the towel. Always, the love for this idea would call out to me and draw me back to it. I picked up the towel and continued ploughing.

It’s now June 2018. Though it has taken quite a while, I am now here in the Bay Area, the most suitable home for this project given the density of technology startups in this region. My green card was finally approved last month. I have accumulated enough runway to give my full attention to this project for the next 10 years. It’s time to look forward.

The vision

The vision of this project is a multi-sided market place enabled by a Turing complete Semantic Query Language. The Semantic Query Language will be interpreted and executed by a fully decentralized data harvesting platform that will have the capacity to gather data from more than 50% of the world’s websites on a daily basis.

Members can choose to participate in this data sharing community by playing one or more of the 4 roles:

  • Members who need data
  • Members who maintain the data declarations
  • Members who will run instances of the Semantic Query Language interpreter on their servers to mine for data
  • Members who sell their own proprietary data

From this vantage point, given its highly decentralized nature, it feels appropriate to make use of blockchains. The final part that needs to be sorted out prior to deploying a blockchain to operate in fully decentralized mode is to figure out the “proof of work”.

Operations available in other database technologies will get ported over where appropriate as and when we encounter relevant use cases surfaced by our community members.

Why now and how is it important?

The more I dwell in this space, the more clearly I see why it is only going to become increasingly important to have this piece of infrastructure in place. There are 3 reasons for this.

Leveling the playing field

The next phase of computing will rely very heavily on machine learning. It is a very data intensive activity. Established data sirens like Facebook, Google, Amazon and Microsoft have aggregated huge amounts of data over the past years, and this has given them a huge unfair advantage which might not necessarily be good for the eco-system. We need to level the playing field by making it possible for other startups to gain easy access to training data for their machine learning work.

Concerns about data ownership

GDPR is a culmination of concerns about data ownership that have been building for the past 10 years. People will increasingly want to establish ownership and control over their own data, independent of the data sirens used to house them. This means a decentralized infrastructure which people can trust to manage their own data.

Increasing world-wide need for computing talents

Demand for engineering talent will only continue to increase as the pervasiveness of computing in our lives increases. The supply of engineering talent does not seem like it will catch up, and the shortfall is projected to continue widening till 2050. A good signal is the increasingly high premium paid to engineering talent in the form of salaries over recent years. It’s just plain stupidity as a civilization to devote major portions of this precious engineering resource to the writing and rewriting of web crawlers for the same data sources over and over again. Their time should be freed up to do more important things.

The first inning

Based on historical observation, I believe we are on the cusp of the very first inning in this space. A good comparison to draw upon is the early days of online music streaming.

Napster versus the music publishers is similar to how the lay of the land was 5 years ago, when Craigslist was able to successfully sue 3Taps.

Last year, LinkedIn lost the lawsuit against folks who were scraping public data. This is a very momentous inflection point in this space. Even the government is starting to come to the conclusion that public data is essentially public, and data sirens like any of the big tech companies should have no monopoly over data that essentially belongs to the users who generated it.

Drawing further upon the music industry analogy, the future of this space should look like how Spotify and iTunes operate in the modern day online music scene.

What about incumbents?

Book Summary: The driver in the driverless car

Conditions that presage the leap into the future in any specific economic segment or type of service

  • Systemic requisite:
    • Widespread dissatisfaction – latent or overt with the status quo
  • Technology requisite:
    • Moore’s Law
      • Cheap computers
      • Cheap sensors – IOT
      • increase in Connection speed
      • Hand hosted AI
    • IOT
      • software
      • data connectivity
      • Handheld computing
    • Artificial Intelligence and Automation
      • Shift of discrete analog task into networked digital one

Five paradigms of computing

  • Electromechanical
  • Relay
  • Vacuum tube
  • Discrete transistor
  • Integrated circuits – Moore’s Law

Current Concerns

  • Speed of technology evolution versus speed of regulation – codified ethics
  • Equality, Risks and Dependency versus Autonomy
    • Does the technology have the potential to benefit everyone?
    • What are the risks and rewards?
    • Does the technology more strongly promote autonomy or dependence?
      • cheap software based technologies inexpensively scaled to reach millions-billions
      • the more revenue generated the more motivated developers would want to share it broadly

Future Concerns

  • Biometric theft
  • Merging of humans with computers
  • Extent of Gene alteration that is socially acceptable – new class of humans differentiated by genetic differences
    • mitigating health risk
    • higher intelligence
    • better looks
    • greater strength
  • Privacy will be a thing of the past
  • Navigating technology trends as a navigator instead of a passenger
  • Large scale drone attacks

Artificial Intelligence

  • Definition: a cheap reliable industrial grade digital smartness running behind everything, Kevin Kelly, Editor of WIRED magazine
  •  Types
    • Narrow AI
    • Strong/General AI
      • Watson
  • Impact on existing human occupations
    • Doctors in health care
    • Lawyers

Education

  • Ancient Greece:
    • Socratic process whereby the teacher guided students through the learning process by asking them questions
    • Education was a privilege reserved for the elites
  • Middle Ages/Renaissance
    • Remained a privilege
    • process of learning became more rote
    • more memorization
  • Online Education
    • Example: Khan academy
    • Researchers found people most likely to take advantage of online courses were those who need the least help
    • LA Unified: giving each student a tablet failed to move the needle
  • Minimally invasive Education, Mitra, New Delhi
    • NIIT building, Kalkaji slums
    • Key component of the learning process was the group dynamic
    • Self-taught scholars learned as quickly as school-bound peers
  • Self-directed learning – flipped model of education
    • teachers no longer broadcast information, write lesson plans or stand in front of classes lecturing
    • teachers became coaches and guides to students needing additional help
    • students consumed recorded lectures or videos online at their own pace and in their own time
    • Teachers focus on judgment, nuances and emotional intelligence

Moore’s Law and poverty

  • Comparatively poorer parts of the world will be able to leapfrog into a more modern and efficient era
    • wireless mobile phones
    • drones for delivery
    • Solar energy power plants
    • driverless cars
      • no need for traffic lights
      • freeways
      • Parking spaces
  • USA has no monopoly on innovation

Driverless Cars

  • Access versus ownership
  • Baidu, Google, Tesla
  • China
    • Beijing, Wuhu and Anhui
  • Singapore
  • city layouts become more flexible
  • commuting is less of a hassle

Current trends

  • Plasma based water purification technology: kills 100% of bacteria and viruses
  • Energy

Further readings

  • How to Create a Mind: The Secret of Human Thought Revealed, Ray Kurzweil
  • The Inevitable, Kevin Kelly
  • The Internet of Things: Mapping the Value Beyond the Hype, McKinsey Global Institute
  • The Infinite Resource: The Power of Ideas on a Finite Planet
  • Abundance: The future is better than you think, Peter Diamandis