Venture Capitalist at Theory

2 minute read / Aug 26, 2024 /

# Higher Levels of Abstraction

Over the weekend, Andrej Karpathy shared this tweet & it inspired me to conduct the 2024 GTM Survey Analysis this way.

I use a language called R to analyze data because of its ability to generate pretty charts & the depth of its statistical analysis tools. Within 90 minutes I had run more than 50 analyses & found 9 statistically significant data points. In the past this kind of work would have taken me 30 to 40 hours.

Programming is becoming prompting.

I used to write something like this to generate a chart:

`ggplot(data) + geom_bar(aes(x = variable, y = value), stat = "identity") + theme_minimal() + labs(title = "Title", caption = "Caption")`

Copilot autocompleted the different fields. But using Sonnet & Cursor, I first wrote “Perform a conjoined analysis, comparing the correlation across all variables within the data frame. Plot this on a bar chart using my particular theme, with an insightful title & a caption for Theory Ventures.”
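A sketch of the kind of code such a prompt yields. The data frame name `data` & the `ndr` target column are hypothetical stand-ins for the survey data, not the actual generated output:

```r
library(ggplot2)

# Hypothetical sketch: correlate every numeric survey column against a
# target variable (here, `ndr`) & plot the coefficients as a bar chart.
num_cols <- data[sapply(data, is.numeric)]
cors <- sapply(num_cols, function(col)
  cor(col, num_cols$ndr, use = "pairwise.complete.obs"))

cor_df <- data.frame(variable = names(cors), correlation = cors)

ggplot(cor_df) +
  geom_col(aes(x = reorder(variable, correlation), y = correlation)) +
  coord_flip() +
  theme_minimal() +
  labs(title = "Correlation with NDR",
       caption = "Theory Ventures")
```

The model wraps this boilerplate in the caller's theme & titles; the human's job shrinks to reading the chart.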

Then I wrote “Run the same analysis for sales quota compared to company size.” Next, “How about NDR for company size?” Each time, the robot produced 150 lines of code in seconds.

More than just the code, I requested tests for statistical significance. I remembered from statistics class in college to perform a t-test for comparing two means when the sample size is greater than 35. But I had forgotten how to compare the means across more than two groups. ANOVA to the rescue.
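In R, the two tests are one-liners. The data below is illustrative, not the survey’s:

```r
# Hypothetical data: NDR by company-size segment
df <- data.frame(
  ndr  = c(105, 110, 98, 120, 115, 122, 95, 130, 108),
  size = factor(rep(c("SMB", "Mid-Market", "Enterprise"), each = 3))
)

# Two groups: Welch's t-test compares the two means
t.test(ndr ~ size, data = subset(df, size %in% c("SMB", "Enterprise")))

# More than two groups: one-way ANOVA compares means across all of them
summary(aov(ndr ~ size, data = df))
```

The model supplies these calls unprompted once asked for significance testing; the human just needs to recognize whether the right test was chosen.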

All of the code is formatted according to proper syntax & it works. The only errors I found concerned color palette specifications.

English is the new programming language. Coding this way, I explored the data much more deeply, more rigorously, & more quickly than I would have otherwise.

The user still needs to be aware of the underlying syntax to fix errors & some statistical tests to verify the computer is doing the right thing, but gone are the days of memorizing the functional arcana of individual programming libraries.

In other words, I’m operating at a higher level of abstraction. Though it may not seem this way, the user interface of data exploration has changed. It’s a back & forth with the computer, a conversation, a dialogue with ongoing output. I’m thinking about the next analysis, not the next functional argument.