Key insights for Jerry's birthday party

On web scraping

- There are only 180 million registered domains in the world in total
- There are 120 billion web pages in the world in total
- .IO is the most popular top-level domain in the world right now
- Make sure to buy up the "CompanyEntity"Suck.com domain, as such domains tend to get a lot of traffic
- Similarity comparison
  - pull out the words on one website's "About us" page and compare them with the words on another website's "About us" page
  - reduce the number of dimensions used to compute similarities; the sample space is 500,000 words
  - K-Means is a cheaper approach than cosine similarity
  - humans are still needed to tag specific data sets
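
To make the "About us" comparison concrete, here is a minimal sketch in Python. It uses plain word counts and cosine similarity (not K-Means), and it limits the number of dimensions by keeping only the most frequent words. The function names and the `max_words` cap are assumptions for illustration, not something discussed at the party.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents, max_words):
    """Keep only the max_words most frequent words across all documents.

    Capping the vocabulary is one simple way to limit the dimensions.
    """
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return {word for word, _ in counts.most_common(max_words)}


def to_vector(text, vocabulary):
    """Count the vocabulary words on one page (a sparse word-count vector)."""
    return Counter(t for t in tokenize(text) if t in vocabulary)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty page is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical "About us" snippets, for illustration only.
    pages = [
        "We build scraping tools for sales teams",
        "Our scraping tools help enterprise sales teams",
        "A family bakery selling bread since 1950",
    ]
    vocab = build_vocabulary(pages, max_words=500_000)
    vectors = [to_vector(p, vocab) for p in pages]
    print(cosine_similarity(vectors[0], vectors[1]))
    print(cosine_similarity(vectors[0], vectors[2]))
```

Comparing every page against every other page this way grows quadratically with the number of sites, which is presumably why a clustering approach such as K-Means came up as the cheaper option.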

- Detecting structure in a page
  - find the tables and rows
  - extract the values within each column and check whether each is a
    - place
    - person
    - address
  - tag all the rest of the content on the page as the same entity
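
The table steps above could be sketched roughly as follows, using only Python's standard library. The entity checks are deliberately crude assumptions (a regex for addresses, a tiny hypothetical list of places, a "Firstname Lastname" pattern for people); a real system would need far better rules or a trained model. Nested tables are not handled.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every cell, grouped by row, for each <table>."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows; each row a list of strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Hypothetical heuristics, for illustration only.
ADDRESS_RE = re.compile(r"^\d+\s+\w+.*\b(street|st|road|rd|avenue|ave)\b", re.I)
PERSON_RE = re.compile(r"^[A-Z][a-z]+ [A-Z][a-z]+$")
KNOWN_PLACES = {"singapore", "london", "tokyo"}


def guess_entity(value):
    """Guess whether a single cell value is an address, place, or person."""
    if ADDRESS_RE.search(value):
        return "address"
    if value.lower() in KNOWN_PLACES:
        return "place"
    if PERSON_RE.match(value):
        return "person"
    return "unknown"


def classify_columns(rows):
    """Label each column with the most common guess among its values."""
    labels = []
    for column in zip(*rows):
        guesses = Counter(guess_entity(value) for value in column)
        labels.append(guesses.most_common(1)[0][0])
    return labels
```

Feed the page HTML to `TableExtractor`, then pass the data rows of a table (skipping the header row) to `classify_columns` to get one label per column.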

- Approaches
  - pure machine: inaccurate
  - pure human: not scalable
  - set up the process as a mix of machine and human for the optimal configuration
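
One common way to mix machine and human is to let the machine keep the predictions it is confident about and queue the rest for a person to review. The sketch below assumes a model that reports a confidence score per prediction; the `Prediction` type and the 0.9 threshold are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's estimated probability, between 0 and 1


def route(predictions, threshold=0.9):
    """Split predictions into auto-accepted ones and ones queued for human review."""
    accepted, review_queue = [], []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            accepted.append(prediction)
        else:
            review_queue.append(prediction)
    return accepted, review_queue
```

Raising the threshold sends more items to humans (more accurate, less scalable); lowering it does the opposite, so the threshold is the knob for finding the optimal configuration.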

- On Distil Networks / bot blocking
  - this company uses machine learning to detect bots
  - currently only 10 of the Fortune 1000 companies use its service
  - fat-tail companies will have the resources and motivation to protect their publicly available data
  - long-tail companies will have neither the resources nor the motivation to protect their publicly available data

On enterprise sales

- It takes 5 years to pass through the trough of sorrow after the initial hype. Enterprise companies come to trust you once you have been around long enough
- Most enterprise companies do not have the capacity to plough through the volume of automated sales leads generated, even if they want to. The main bottleneck is their sales team
- Enterprise companies are willing to pay really high margins
- Sample concepts
  - BuiltWith.com: used by hedge fund managers to track how much traction a piece of software has gained among users, based on its JavaScript snippet
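
As a rough idea of how snippet-based detection can work (this is a guess at the general technique, not BuiltWith's actual method), one can collect the `src` of every `<script>` tag on a page and match it against known URL fragments. The `SIGNATURES` table below is a small illustrative sample, not a real product database.

```python
from html.parser import HTMLParser

# Illustrative mapping from script URL fragment to product name.
SIGNATURES = {
    "google-analytics.com/analytics.js": "Google Analytics",
    "js.stripe.com": "Stripe",
    "widget.intercom.io": "Intercom",
}


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def detect_technologies(html):
    """Return the set of product names whose script fragment appears on the page."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    found = set()
    for src in collector.sources:
        for fragment, name in SIGNATURES.items():
            if fragment in src:
                found.add(name)
    return found
```

Running this over many sites at regular intervals and counting how often each product shows up would give the kind of traction trend described above.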

On Social networks

- Facebook uses collaborative filtering
- The most lucrative advertising audience is still white North Americans
- The African demographic doesn't spend much, which makes it a really bad advertising target, but its users are really loyal once acquired
- The African demographic drives much of the music and culture
- To optimize monetization spend as well as server resources, one could use Facebook page likes as a condition to filter for more lucrative demographics
- Short video is the trend now
  - Snapchat is considered messaging rather than video
  - Instagram is in the space
  - TikTok is in the space
  - differentiation is a challenge in this space

- iMessage is the largest competitor of Facebook Messenger. The former spans both East and West

On Venture Capital

- Founders might get blocked by investors from selling the company past the trough of sorrow stage (typically 5 years in)
- Founders might want to exit while investors still need to reach their return multiples (4x minimum)
- Investors might seek to replace the CEO, bringing in a growth/scaling CEO in place of a product-centric CEO

On Mobile gaming

- Each game has a lifespan of 5-6 years
- In-app purchases are the main driver of revenue
- The failure rate is very high
- Assuming 4 experimental teams, an operation typically generates one successful mobile game per year

Contributors

- Jerry
- Perry
- Yi