Venture Capitalist at Theory


2 minute read / Sep 9, 2024 /

The Rarity Shibboleth

Large language models are wonderful at ingesting large amounts of content & summarizing it. I can upload an academic paper, pester the model with an endless list of questions & it will respond with equally endless patience.

In comparing the two most recent Microsoft earnings calls, Claude highlighted:

Excellent analysis.

Benn Stancil described LLMs as great averagers of information. But what if I don’t want the average? If I seek the data point two standard deviations out? If I’m channeling my inner Anthony Bourdain & I’m seeking the fermented shark or cobra heart within my query?

At Google, we ranked web pages & ads with many signals. An engineer taught me that one of them was the rarity of a word across a set of documents.

For example, if one document in a collection contains 10 instances of the word ghoti, while the average document contains 0 ghotis, that document is likely the best to answer a search about fish.1
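To make the rarity signal concrete, here is a minimal sketch of the textbook scoring idea in Python. It is not how Google ranks pages; the toy corpus, the function names & the smoothed form of the inverse document frequency are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus):
    """Score `term` for one document against a corpus (a list of token lists)."""
    if not doc_tokens:
        return 0.0
    # Term frequency: the share of this document's tokens that are `term`.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    # Document frequency: how many documents mention `term` at all.
    df = sum(1 for tokens in corpus if term in tokens)
    # Smoothed inverse document frequency (one common variant, assumed here):
    # the fewer documents contain the term, the larger the weight.
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1
    return tf * idf

def best_document(term, corpus):
    """Return the index of the highest-scoring document plus all scores."""
    scores = [tf_idf(term, doc, corpus) for doc in corpus]
    best = max(range(len(corpus)), key=scores.__getitem__)
    return best, scores

# Hypothetical toy corpus, invented for this example.
corpus = [
    "ghoti ghoti ghoti is a respelling of fish ghoti".split(),
    "the market rallied after the earnings call".split(),
    "a boy with a dog in a park".split(),
]

best, scores = best_document("ghoti", corpus)
print(best, scores)
```

The document that repeats the rare word wins, while a common word such as "a" earns a smaller weight because it appears in more of the documents.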

I haven’t found a way to goad an LLM into producing the rare result. Why did one user decide to use a particular piece of software when 10 others did not?

Maybe I haven’t learned how to prompt an LLM well. Ideogram launched a feature called Magic Prompt that expands a basic prompt into richer instructions that elicit better results, marrying the language of a user & a computer.

A user might write: A boy with a dog in a park.

Magic Prompt replies: A heartwarming scene of a young boy playing with his happy, wagging dog in a lush, green park. The boy has a playful smile on his face while he tosses a bright yellow ball for his furry friend to fetch. The park is filled with beautiful flowers and trees, creating a serene atmosphere for this bonding moment between boy and dog.

Magic Prompt surprised me in three ways.

First, unlike Google search where terse queries tend to work very well, LLMs relish verbosity. Less Hemingway & more Faulkner.

Second, maybe there’s a prompt for rarity. If you know what it is, please share the rarity shibboleth with me.

Third, I wonder what other prompt techniques have yet to be discovered that would wrest the best from an LLM.


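1 This technique is called TF-IDF, short for Term Frequency-Inverse Document Frequency. A word scores highly for a document when it appears often in that document & rarely across the rest of the collection.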

