Key insights for Jerry’s birthday party

On web scraping

  • There are only 180 million registered domains in total worldwide
  • There are 120 billion web pages in total worldwide
  • .IO is the most popular top-level domain in the world right now
  • Make sure to buy up the “CompanyEntity”Sucks.com domain, as such domains tend to get a lot of traffic
  • Similarities comparison
    • pull out the words on a website’s About Us page and compare them with other websites’ About Us pages
    • reduce the number of dimensions used to compute similarities – sample space of 500,000 words
    • K-Means, as opposed to cosine similarity, is a cheaper approach
    • still need humans to tag specific data sets
  • Detecting structure in page
    • find the tables and rows
    • extract the values within each column and check whether each is a
      • place
      • person
      • address
    • tag all the rest of the content on the page as the same entity
  • Approaches
    • pure machine – inaccurate
    • pure human – not scalable
    • set up a process that mixes machine and human for the optimal configuration
  • On Distil Networks/bot blocking
    • this company uses machine learning to detect bots
    • currently only 10 of the Fortune 1000 companies use its service
    • fat tail companies will have the resources and motivation to protect their publicly available data
    • long tail companies will have neither the resources nor the motivation to protect their publicly available data
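
A minimal sketch of the About Us comparison described above, assuming the page text has already been scraped. It uses cosine similarity over word counts restricted to the most frequent words, so the vocabulary cap stands in for the dimension reduction. The function names, the sample texts, and the cap of 500 words are illustrative assumptions, not part of the original notes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents, max_words):
    """Keep only the max_words most frequent words across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return {word for word, _ in counts.most_common(max_words)}


def vectorize(text, vocabulary):
    """Count the words of one page, ignoring anything outside the vocabulary."""
    return Counter(w for w in tokenize(text) if w in vocabulary)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical About Us texts, invented for illustration only.
about_pages = {
    "alpha": "We build web scraping tools for data teams",
    "beta": "We build analytics tools for data science teams",
    "gamma": "Family bakery serving fresh bread and cakes",
}

vocabulary = build_vocabulary(about_pages.values(), max_words=500)
vectors = {name: vectorize(text, vocabulary) for name, text in about_pages.items()}

sim_alpha_beta = cosine_similarity(vectors["alpha"], vectors["beta"])
sim_alpha_gamma = cosine_similarity(vectors["alpha"], vectors["gamma"])
print(f"alpha vs beta:  {sim_alpha_beta:.2f}")
print(f"alpha vs gamma: {sim_alpha_gamma:.2f}")
```

Pairwise comparison like this grows quadratically with the number of sites, which is one reason a clustering pass such as K-Means can end up cheaper at scale.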

On enterprise sales

  • It takes 5 years to pass through the trough of sorrow after the initial hype. Enterprise companies come to trust you once you have been around long enough
  • Most enterprise companies do not have the capacity to plough through the volume of automated sales leads generated even if they want to. The main bottleneck is their sales team
  • Enterprise companies are willing to pay really high margins
  • Sample concepts
    • BuiltWith.com – used by hedge fund managers to track how much traction a piece of software has gained among users, based on its JavaScript snippet

On social networks

  • Facebook uses collaborative filtering
  • The most lucrative advertising audience is still white North Americans
  • The African demographic doesn’t spend much, which makes it a poor advertising target, but its users are really loyal once acquired
  • The African demographic drives much of the music and culture
  • To optimize monetization spend as well as server resources, one could use Facebook page likes as a condition to filter for more lucrative demographics
  • Short video is the trend now
    • Snapchat is considered more messaging than video
    • Instagram is in the space
    • TikTok is in the space
    • Differentiation is a challenge in this space
  • iMessage is the largest competitor of Facebook Messenger. The former spans both East and West.

On venture capital

  • Founders might get blocked by investors from selling the company past the trough of sorrow stage (typically 5 years in)
  • Founders might want to exit while investors still need to get their return multiples (4x minimum)
  • Investors might seek to replace the CEO to bring in a growth/scaling CEO as opposed to a product-centric CEO

On mobile gaming

  • Each game has a lifespan of 5-6 years
  • In-app purchases are the main driver of revenue
  • Failure rate is very high
  • Assuming 4 experimental teams, the operation typically generates one successful mobile game per year.

Contributors

  • Jerry
  • Perry
  • Yi
