Why I built GetData.IO

When training a neural network, there is a period before enough training data has been collected when its output is full of false positives and false negatives (aka junk).

The same could be said of the human brain (a biological neural network). Unless you have access to another human brain whose output you can fully trust and rely on, there will be a prolonged period when you fumble around, struggling to gather enough data to build a mental model of the new domain.

Based on my experience, the main challenge when breaking into a new domain is that no pre-trained neural network exists. In such situations, expect a prolonged period of confusion and fumbling around. Persistence (aka brute force iteration) is probably the only thing you can fall back on.

Thankfully, totally new domains seldom exist. Whatever “new domain” you think you are trying to break into, someone else is probably either doing it right now or has already done it.

That is why I built GetData.IO. It helps people who need data to make good decisions quickly find both the data they need and the people who might already have a trained model.

Weapons of Math Destruction by Cathy O’Neil

Key takeaways

  • Human values like justice and mercy are hard, if not impossible, to encode as rules
  • Data scientists use proxies as an approximate gauge for the existence of values. These are inherently inaccurate, if not downright wrong
  • While using race as a feature to determine whether a loan should be approved is obviously racist, the use of zip codes, though less obvious, is equally racist since races tend to be segregated along geographical lines
  • Models are increasingly used across various domains to speed up decision making. This increases the negative impact that badly designed models have on humans
  • A feedback loop is necessary to ensure continuous correction of badly designed models, for example transparency about how your credit score is calculated
  • Regulations on the use of models are necessary, as companies driven by the quarterly reporting requirements of shareholders are primarily focused on the bottom line

 

General thoughts on training a trading bot

  • regime change occurs on average every 3 months, after which the model is outdated.
  • early signs of an outdated model include consistent non-commit signals
  • initial changes to trading parameters tend to yield poor initial outcomes.
  • Good outcomes require time to play out
  • Buying on the MACD bullish reversal tends to be too late in a volatile market. Potential gains from the reversal would most likely have played out by then
  • Drastically reducing the number of outstanding positions leads to inefficient capital deployment, as capital is left idle while a bullish trend plays itself out
  • Explore buying when the negative MACD trend slows down.
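
One way to explore that last idea is to watch the MACD histogram (the MACD line minus its signal line) instead of waiting for the crossover. The sketch below is a minimal illustration in plain Python, not the bot's actual logic. The 12/26/9 spans are the conventional defaults, while the function names, the two-bar "slowing down" rule and the synthetic price series are assumptions made up for this example.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd_histogram(closes, fast=12, slow=26, signal=9):
    """MACD line (fast EMA minus slow EMA) minus its signal line."""
    macd = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    signal_line = ema(macd, signal)
    return [m - s for m, s in zip(macd, signal_line)]


def early_entry_signals(hist):
    """Bars where the histogram is still negative but has risen twice in a row.

    This is the hypothetical 'negative trend slows down' rule.
    """
    return [
        i for i in range(2, len(hist))
        if hist[i] < 0 and hist[i] > hist[i - 1] > hist[i - 2]
    ]


def bullish_crossovers(hist):
    """Bars where the histogram turns positive (the classic bullish reversal)."""
    return [i for i in range(1, len(hist)) if hist[i - 1] <= 0 < hist[i]]


# Synthetic example: 40 falling bars followed by 40 rising bars.
closes = [100 - i for i in range(40)] + [62 + i for i in range(40)]
hist = macd_histogram(closes)
print("first early signal at bar", early_entry_signals(hist)[0])
print("first crossover at bar", bullish_crossovers(hist)[0])
```

On this synthetic series the early signal fires before the crossover, which is the point of the idea. Whether the earlier entry survives real, noisy prices is exactly what would need to be backtested.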

The dichotomy between privacy and health

1984: Big Brother is Watching

Across multiple books, it has been stated that privacy versus health will be one of the primary dichotomies that societies around the world will need to grapple with as technological advances are made in the fields of artificial intelligence, communications (surveillance) and medical science (genetic research).
 
What is surprising is the rate at which the Corona pandemic catalyzed this change. In light of this, it is fascinating to observe how different societies position themselves along the spectrum. Some societies have opted for surveillance to the maximum extent possible with current technology, while others have opted for the polar opposite, going to the extent of staging mass protests against its use.
 

Related readings:

  • The AI Economy, Roger Bootle
  • To Be a Machine, Mark O’Connell
  • Irrational Exuberance, Robert J. Shiller
  • Life 3.0: Being Human in the Age of Artificial Intelligence, Max Tegmark
  • Mind Children: The Future of Robot and Human Intelligence, Hans Moravec
  • The Singularity Is Near, Ray Kurzweil
  • 1984, George Orwell

The AI Economy, Roger Bootle

Paradoxes

  • Polanyi’s paradox
  • Moravec’s paradox

Key skill sets for the AI era

  • complex communication
  • Creativity
  • Strategic thinking / critical thinking
  • Empathy / humanity

Key themes

  • AI as labor cost versus AI as capital expenditure
  • Taxes on AI development versus edge in global competition
  • Labor versus leisure
  • Global positioning
  • Population size as advantage for big data

Book summary: To Be a Machine by Mark O’Connell

At some point, given enough technological advances, we might no longer be restricted to our human form. It makes sense to seriously consider what it fundamentally means to be human.

The three phases of transhumanism

  • Biological body, biological brain
  • Biological body, augmented brain
  • Augmented body, augmented brain

Related readings

  • Mind Children: The Future of Robot and Human Intelligence, Hans Moravec

Bloomberg does AB testing

It is fascinating to observe major news networks running AB tests on the entire world’s population.

Barely 72 hours after the announcement of the phase 1 trade deal, with its accompanying mass euphoria and surge in world markets, almost the exact same photo, with some slight changes in copywriting and background color, was released into production.

It will be fascinating to observe the world’s reaction to this newly released AB test variant and the corresponding market price levels.

Key insights from the Mooncake Festival at Jerry and Liza’s house

Technology trends

  • Companies are increasingly shifting their services from one-off, on-premise licensing to cloud-based SaaS recurring subscription models
    • revenue hit in the short run
    • increased customer LTV in the long run
    • affected publicly traded companies will experience short-term discounts to their share prices
  • Artificial intelligence versus Augmented intelligence
    • companies are increasingly shifting away from automatic insight generation to systems that help decision makers simulate and model potential outcomes when specific policies are executed
    • demand is shifting from insight generation to data cleaning services
  • Corporate adoption of artificial intelligence
    • CEOs are increasingly considering how to leverage AI as a tool for their trade
    • primary use case is figuring out how to increase their sales volume
    • experiencing challenges in applying AI to in-house data to achieve monetization goals
  • Rise of deep vertical data networks
    • EverString – provides sales lead refresh for all client companies, and ends up becoming a large database of information on decision-making executives, approximately 6 million records
    • StreetSine.sg – cleans up real estate data to help agents better price houses for sale by utilizing its in-house agency, and ends up becoming a large database of high quality real estate data
  • Crypto-currency
    • Bitcoin is still the main poster child
    • general population still skeptical about Libra
    • main argument is still to remove central bank controls
    • main adoption hurdles
      • write throughput
      • serving as a stable store of wealth
      • starting to be used as a means to facilitate transactions in China
      • Inability to increase or decrease the currency supply in times of need makes it hard to provide much-needed stimulus during recessions or to respond to inflation

US/China trade war

  • sources of conflict
    • technology theft
    • forced technology transfer
    • unfair trade practices like subsidized state-owned Chinese companies operating in the export markets
  • economy
    • China is experiencing inflationary deleveraging
      • local farmers are not growing critical food sources
      • critical food supplies are imported
      • prices of imported goods are denominated in the US reserve currency
      • shifting of supply chain out of China to
        • Vietnam
        • India
        • Taiwan
      • capital flight
        • Li Ka Shing moved funds out from Hong Kong in 2013 to Europe
        • raising funds for US venture capital from China was easy before the Chinese and US governments shut it down
    • US is experiencing deflationary deleveraging
      • businesses are concerned about macro environment and are reducing fixed investments
      • manufacturing is slowing due to decreased demand both locally and overseas
      • consumer spending and confidence are still strong
  • Chinese domestic concerns
    • Potential US meddling in Chinese domestic affairs – Hong Kong’s demonstrations and demands
      • Revoking of National Education
      • Revoking of extradition bill
      • Resignation of the HK Chief Executive
      • universal suffrage: freedom to elect their own leaders
    • destabilized situation presents a challenge for Xi Jinping’s party to retain power over former leader Jiang Zemin’s faction
  • value system
    • US is a highly rule-based system
    • China’s system of control is highly dependent on the individual in power. Direct government intervention in the distribution of wealth is a major source of concern

US/Mexico and world issues

  • NAFTA agreement was too one-sided and failed to take into account the large imbalance between the two economies
  • The US arrangement of allowing Mexican taxpayers the right to claim dependents ultimately resulted in tax claims and refunds for entire extended families in Mexico. This has the effect of subsidizing Mexicans at the expense of Americans living in the Rust Belt
  • It is observed that income inequality is becoming prevalent across the entire world, not just within the US and China.

Insights from an evening of wine and a jam session with Adriene and Dennis, and a birthday party with Konstantine

From Tommaso on Machine Learning – Economics

  • Study was done on Italy
  • Voting patterns can be leading indicators for credit spread
  • When a political party is stable, the votes of party members cluster together (small Eigen distance), even for bills that are not critical for the party
  • When a political party becomes unstable, the Eigen distance between the votes of party members on non-critical bills will increase
  • Alignment will increase before a wild swing to misalignment
    • periods of high alignment lead to very tight credit spreads
    • a tight credit spread indicates a very high bond price
  • Misalignment will decrease before consolidation towards alignment
    • periods of low alignment lead to very wide credit spreads
    • a wide credit spread indicates a very low bond price
  • proposed strategy:
    • when alignment increases, short bonds in anticipation of forthcoming misalignment
    • when misalignment increases, long bonds in anticipation of forthcoming alignment
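
These notes do not define the distance measure precisely, so the sketch below substitutes the mean pairwise Euclidean distance between members' vote vectors as a stand-in for the alignment measure. The vote encoding (+1 yes, -1 no, 0 abstain), the function names and the sample votes are all hypothetical. Only the long/short rule is taken from the proposed strategy.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(votes):
    """Average Euclidean distance between every pair of members' vote vectors.

    `votes` maps a member name to a list of votes on the same bills,
    encoded as +1 (yes), -1 (no) or 0 (abstain). Lower means more aligned.
    """
    vectors = list(votes.values())
    pairs = list(combinations(vectors, 2))
    return sum(dist(u, v) for u, v in pairs) / len(pairs)


def bond_signal(previous_distance, current_distance):
    """Apply the proposed rule to two consecutive distance readings."""
    if current_distance < previous_distance:
        return "short"  # alignment increasing: anticipate misalignment
    if current_distance > previous_distance:
        return "long"   # misalignment increasing: anticipate alignment
    return "hold"


# Hypothetical party votes on three non-critical bills.
last_quarter = {"a": [1, 1, -1], "b": [-1, 1, 1], "c": [1, -1, 1]}
this_quarter = {"a": [1, 1, -1], "b": [1, 1, -1], "c": [1, 1, 1]}

previous = mean_pairwise_distance(last_quarter)
current = mean_pairwise_distance(this_quarter)
print(previous, current, bond_signal(previous, current))
```

A real version would need actual roll-call data and the study's own distance definition. The point here is only that the signal depends on the direction of change in alignment, not on its level.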

From Tommaso on Machine Learning – Micro-biology

  • Started studying how the presence of specific bacteria affects health at an aggregate level
  • Studying the ancestral tree of bacteria helps estimate the distribution of bacteria in the gut of different individuals

On sales

  • Mormons are among the best salespeople due to their coming-of-age ritual.
  • They typically get rejected many, many times during this rite of passage

On human cyborgs

  • Neil Harbisson is a color-blind artist who had an antenna implanted in his skull that lets him perceive colors as sound

On hardware and biotech with Andrew

  • To design a printed circuit board, install an IDE that allows easy assembly, simulation and coding – DesignSpark
  • Firmware for microprocessors is written in C. Firmware controls where signals are passed to when incoming signals are received
  • Microprocessors are installed on the circuit board
  • Boards can be printed in China and shipped over in 5 days – PCBWay
  • BioCurious and another biotech hacker space up in Berkeley have managed to train yeast to produce cocaine and THC

Related readings