Discussion about this post

Michael Laccetti

This may be a framing gap, but I work with inference costs daily and don’t really see them trending downward in practice as models advance.

When you reference declining prices and a ~10×/year drop in inference cost, what layer of cost are you using? Token list prices, provider economics, or effective cost per task for users?

From an operational perspective, those metrics seem to pull in different directions, especially as newer models burn more reasoning tokens and runtime per task. I'm interested in how you're factoring that into your view.
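
To make the tension concrete, here is a rough sketch with hypothetical numbers (none of these prices or token counts come from the post): a ~10× drop in per-token list price can still coexist with a higher effective cost per task once longer reasoning traces are counted.

```python
# Rough, hypothetical numbers -- not from the post -- just to show how
# per-token price and per-task cost can move in opposite directions.

old_price_per_1k = 0.03    # assumed older-model list price, USD per 1k output tokens
new_price_per_1k = 0.003   # assumed newer model, ~10x cheaper per token

old_tokens_per_task = 1_000    # short direct answer
new_tokens_per_task = 30_000   # long reasoning trace plus tool calls

old_cost_per_task = old_price_per_1k * old_tokens_per_task / 1_000
new_cost_per_task = new_price_per_1k * new_tokens_per_task / 1_000

print(f"old: ${old_cost_per_task:.3f} per task")  # $0.030
print(f"new: ${new_cost_per_task:.3f} per task")  # $0.090 -- 3x more despite 10x cheaper tokens
```

Which of those layers your decline figure refers to changes the conclusion quite a bit.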

