If an aging population is an inevitable affliction of all industrialized countries, and the majority of countries will become industrialized within the next 30 years, then we should expect our population to collapse by 2050. On this premise, rather than worrying that the majority of workers will be replaced by robots and made irrelevant, we should worry that robots are not taking over the tasks handled by forthcoming retirees fast enough.
Risk versus uncertainty
- Risk can be mathematically modeled to yield a probability
- Uncertainty cannot be mathematically modeled
Conditions for quality data
Why Google’s search data is better than Facebook profile data
- Subject feels she has privacy
- Subject feels she is not judged
- Subject sees tangible benefit from being honest
The hedgehog versus the fox
- The hedgehog approaches reality through a narrative/ideology while the fox thinks in terms of probabilities
- The hedgehog goes very deep in an area while the fox employs multiple different models
- The fox is a better forecaster than the hedgehog
- The fox is more tolerant of uncertainty
- More data does not yield better results and predictions
- Choose the right kind of data from the abundance available
- To predict well, start from intuition and keep the model simple
- Qualitative data should be weighed and considered
- Be aware of your own biases
- Similarity scores – clustering in Netflix and baseball
- Be wary of confirmation bias
- Be wary of overfitting to a small sample (e.g. Tokyo earthquakes and global warming)
- Correlation does not equal causation
- Shorthand heuristics reduce the computational space, for example in chess
- Irrational Exuberance, Robert Shiller
- Expert Political Judgment, Philip E. Tetlock
- Future Shock, Alvin and Heidi Toffler
- Principles of Forecasting, J. Scott Armstrong
- Predicting the Unpredictable, Hough
Grepsr is increasingly being used in the workplace by Quid.
Business people who don’t know how to code use Grepsr to pull data.
There is increasing demand for DIFFs to identify thematic trends. Themes are extracted from articles through the use of NLTK.
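A minimal sketch of what such a theme diff could look like, assuming the themes have already been extracted from each article (the notes mention NLTK for that step; here the theme lists are hard-coded). The function names and the sample data are hypothetical and only illustrate the idea of comparing theme counts between two periods.

```python
from collections import Counter


def theme_counts(articles):
    """Count how many articles mention each theme (one vote per article)."""
    counts = Counter()
    for themes in articles:
        counts.update(set(themes))
    return counts


def theme_diff(before, after):
    """Return the change in article count per theme, omitting unchanged themes."""
    old, new = theme_counts(before), theme_counts(after)
    return {
        theme: new[theme] - old[theme]
        for theme in sorted(set(old) | set(new))
        if new[theme] != old[theme]
    }


# Hypothetical theme lists, one list per article.
last_month = [["robotics", "aging"], ["aging"], ["betting"]]
this_month = [["betting", "robotics"], ["betting"], ["betting", "nba"]]

diff = theme_diff(last_month, this_month)
for theme, change in diff.items():
    print(f"{theme}: {change:+d}")
```

Themes with a positive change are gaining coverage and themes with a negative change are fading; unchanged themes are left out so that only the movement is visible.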
The proliferation of machine learning libraries and the maturing of the semantic web are democratizing access to insights.
The legalization of online sports betting has opened fertile ground for this trend toward democratization.
NBA basketball predictive modeling should be done at the player level instead of the team level, as team-level data becomes too lossy.
Sportsbooks set the odds on opening lines to encourage even betting on both sides. The odds on closing lines reflect a weighted average of bets (signals) from the crowd.
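As a rough illustration of the closing-line idea, the sketch below treats each bet as a probability signal and averages the signals weighted by stake. Weighting by stake is an assumption, not something stated in the notes, and real sportsbooks also build in a margin that is ignored here. All numbers are made up.

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to an implied probability (bookmaker margin ignored)."""
    return 1.0 / decimal_odds


def weighted_closing_probability(bets):
    """Stake-weighted average of (probability, stake) signals from the crowd."""
    total_stake = sum(stake for _, stake in bets)
    return sum(prob * stake for prob, stake in bets) / total_stake


# An opening line priced for even betting: decimal odds of 2.0 on each side.
opening = implied_probability(2.0)

# Hypothetical crowd signals: (believed win probability, stake in USD).
bets = [(0.60, 100.0), (0.50, 300.0), (0.70, 100.0)]
closing = weighted_closing_probability(bets)

print(f"opening: {opening:.2f}, closing: {closing:.2f}")
```

The large bet at 0.50 pulls the average toward itself, which is the sense in which heavier bets act as stronger signals.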
Conversations with Yi (EverString)
The forthcoming trend for engineering
Machine learning is increasingly becoming commoditized, and DevOps is becoming more important. Demand for specialized services that encapsulate DevOps will increase further as demand for engineering work continues to outstrip the supply of engineers.
On the lead generation market
Companies in the lead generation space need scalable web crawlers. This helps offset the cost of retaining three in-house engineers.
The lead generation space has consolidated. There were previously 120k such companies; there are now 7k in operation. The majority of players generate leads by scraping LinkedIn.
The consumer space requires constant development of new features. The enterprise space is service-heavy: it requires not just lead generation but an entire channel marketing service suite (physical mail, online advertising, email marketing).
Lead gen customers are hard to retain. The list becomes less valuable once it has been used. 80% yearly churn is normal. One company reduced yearly churn to just 10% by cutting the second-year subscription from USD 800/yr to USD 200/yr, with a further discount to USD 100/yr if customers still don’t like it. The recurring service is for grabbing fresh leads from the same data source.
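A quick back-of-the-envelope check of the churn figures above, assuming a hypothetical cohort of 100 first-year customers. It ignores acquisition cost, the cost of serving renewals, and the further USD 100 tier, so it is only a sketch of why the discount can pay off.

```python
def second_year_revenue(customers, churn_rate, price_usd):
    """Revenue from the customers who renew for a second year."""
    retained = round(customers * (1 - churn_rate))
    return retained * price_usd


cohort = 100  # hypothetical cohort size, not from the notes

baseline = second_year_revenue(cohort, 0.80, 800)    # 80% churn at USD 800/yr
discounted = second_year_revenue(cohort, 0.10, 200)  # 10% churn at USD 200/yr

print(f"baseline: USD {baseline}, discounted: USD {discounted}")
```

Under these assumptions, 20 renewals at USD 800 bring in USD 16,000, while 90 renewals at USD 200 bring in USD 18,000, so the lower price slightly increases second-year revenue while keeping far more customers.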
On teleconferencing
Compared with UberConference, Zoom’s product team has developed a better understanding of the true conferencing needs of its users in various contexts. They have worked harder to ensure their product works seamlessly in the identified scenarios. A typical example is the ability to join a conference by the press of a button on a mobile phone while driving, instead of having to type the typical 4-digit PIN.
Building and optimizing the entire infrastructure (hardware and software) from the ground up, with autonomous driving as the mission
Mission and decision making
Design decisions trade off functionality against cost, to achieve the mission while keeping cost under control
- Lidar is not useful when cameras are available
- Driving with HD mapping makes the entire operation brittle, since actual road conditions can change
- Data Team
- Hardware Team
- Software Team
The data model
- Cars on roads are constantly collecting new data
- New data is used to train and improve the neural network model
- The improved model is constantly deployed back to the cars to improve self-driving
- Real-world data provides visibility into long-tail scenarios that simulated data cannot. Simulating long-tail scenarios is an intractable problem
- Balancing between data model and software
- Neural networks are suitable for problems that are hard to solve by defining functions / heuristics
- Simple heuristics are better handled through coding in software
Future revenue model
Robo-taxis that will disrupt the ride-sharing space.
- Consumer car – USD 0.60 / mile
- Ride sharing – USD 2-3 / mile
- Tesla Network – USD 0.18 / mile
- Legal – need more data and processing time to get approved
- Battery capacity
- Social norms around robo-taxis