Can LLMs really get cheaper?
Two weeks ago, I asked Peter Welinder, VP of Product at OpenAI, this question.
Peter explained, "There are two ways for LLM inference to be cheaper:"
Energy prices drop.
We find an alternative to NVIDIA.
Then he added:
Energy prices will continue to rise as demand increases much faster than supply.
NVIDIA currently has a monopoly, so we'll see about that.
In traditional businesses, you achieve economies of scale by spreading fixed costs (factory, equipment, employees, rent, etc.) over more units: the cost per unit falls, and the margin on each sale, at scale, covers those fixed costs.
Model training is a fixed cost; the real issue is the variable cost of inference. Because inference is bound by energy prices and hardware/infrastructure, and because models keep growing (more compute to train, a fixed cost, and more expensive inference, a variable cost), the only logical move is to raise prices.
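To make that unit-economics argument concrete, here is a minimal sketch with purely hypothetical numbers (the training cost, per-request inference cost, and request volumes are assumptions for illustration, not figures from OpenAI or the conversation above): amortizing training over more requests drives the fixed-cost share toward zero, but the variable inference cost sets a floor that scale alone cannot remove.

```python
# Minimal unit-economics sketch with hypothetical numbers.
# None of these figures come from the article; they only illustrate
# why inference (a variable cost) puts a floor under the price.

TRAINING_COST = 100_000_000         # fixed cost: one training run, USD (assumed)
INFERENCE_COST_PER_REQUEST = 0.002  # variable cost per request, USD (assumed)

def cost_per_request(num_requests: int) -> float:
    """Total cost per request = variable inference cost + amortized training cost."""
    return INFERENCE_COST_PER_REQUEST + TRAINING_COST / num_requests

for n in (10**8, 10**9, 10**10, 10**11):
    print(f"{n:>15,} requests -> ${cost_per_request(n):.4f} per request")

# As volume grows, the amortized training cost approaches zero, but the total
# never drops below the $0.002 variable floor unless energy or hardware gets cheaper.
```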
However, competition keeps emerging, shipping better models in different areas every 3–6 months and raising consumer expectations for quality. With almost zero switching costs and none of the brand loyalty you see with Coke/Pepsi, R&D spending keeps climbing, adding to the total cost.
Now we have:
Models are expensive to train.
AI labs can't (yet) charge a premium for providing the model.
Energy prices are not decreasing.
Heavily invested AI labs need to generate revenue at some point (within 12–24 months).
Given these factors, I don’t see a way for prices to decrease, assuming market dynamics remain the same.
LLMs gained popularity through a UX innovation with ChatGPT. Now, it seems we need a financial innovation to sustain their growth.
What do you think?