2 minute read / Oct 29, 2024 /
My AI Rube Goldberg Machine
In yesterday’s post, I calculated the profitability of public software companies. To calculate these figures, I built a little Rube Goldberg machine.
I didn’t download the data into Excel. Instead, I complexified things by sending the analysis to 4 AIs to see if they would agree.
The inspiration : many companies have used Amazon’s Mechanical Turk to crowdsource tasks, & pick a consensus answer across three workers to improve accuracy.
Why not try this across 4 AI workers instead?
Prompt : “calculate the average net income margin and cash flow from ops margin from this data set” plus the data set. Note that CFOM isn’t a simple average but requires dividing cash flow from ops by revenue beforehand.
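A minimal sketch of the calculation the prompt asks for, assuming each margin is computed per company and the ratios are then averaged. The company rows and field names below are hypothetical placeholders, not the data set from the post.

```python
from statistics import mean

# Hypothetical figures (in $M) for illustration only; not the post's data set.
companies = [
    {"name": "A", "revenue": 100.0, "net_income": -10.0, "cash_flow_ops": 20.0},
    {"name": "B", "revenue": 200.0, "net_income": 10.0, "cash_flow_ops": 30.0},
    {"name": "C", "revenue": 50.0, "net_income": -5.0, "cash_flow_ops": 5.0},
]


def average_margin(rows, numerator_key):
    """Divide by revenue for each company first, then average the ratios (in %)."""
    return mean(100.0 * row[numerator_key] / row["revenue"] for row in rows)


nim = average_margin(companies, "net_income")
cfom = average_margin(companies, "cash_flow_ops")
print(f"NIM: {nim:.2f}%  CFOM: {cfom:.2f}%")
```

Averaging the raw cash flow column without dividing by revenue first gives a dollar figure rather than a percentage, which is one way to end up with a number in the thousands.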
Model | NIM, % | CFOM, % |
---|---|---|
Claude | 4.99 | 27.31 |
Gemini | -9.29 | 16.2 |
Perplexity | -8.67 | 14.4 |
ChatGPT | -9.29 | 1,433.01 / 14.9 |
My Analysis | -9.29 | 16.2 |
Gemini scored top marks for tabulating both columns correctly. ChatGPT did well with NIM but “forgot” to complete the additional division step. I corrected that with a follow-up, but the result still wasn’t the right figure. The other systems missed the mark altogether.
It would be a mistake to draw any broad conclusions from my little experiment.
But in this case, consensus doesn’t yet work as a strategy, which means I still need to double-check calculations myself.
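Here is a rough sketch of what a consensus check over the four answers could look like, using the figures from the table (with ChatGPT's follow-up CFOM). The `consensus` function, the three-vote threshold & the tolerance are my own assumptions for illustration, not a standard method.

```python
def consensus(answers, min_votes=3, tolerance=0.01):
    """Return a value that at least min_votes answers match within tolerance, else None."""
    values = list(answers.values())
    for candidate in values:
        votes = sum(1 for v in values if abs(v - candidate) <= tolerance)
        if votes >= min_votes:
            return candidate
    return None


# Answers from the table above, in percent.
nim_answers = {"Claude": 4.99, "Gemini": -9.29, "Perplexity": -8.67, "ChatGPT": -9.29}
cfom_answers = {"Claude": 27.31, "Gemini": 16.2, "Perplexity": 14.4, "ChatGPT": 14.9}

print(consensus(nim_answers))   # None: only two of the four agree
print(consensus(cfom_answers))  # None: no two answers agree
```

Loosening the threshold to two votes would recover the NIM figure, but CFOM would still have no agreeing pair.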
At some point, AI will mechanize the illusory Mechanical Turk & I’ll restart my Rube Goldberg math machine with confidence.