“you absolutely have to view LLM benchmarks from a position of default-distrust” @ouguoc.mastodon.online.ap.brid.gy describes how easily answers to benchmark problems can leak into the training set. https://seinmastudios.com/posts/llm-benchmarks-are-not-trustworthy/
why is the link not a link? let's try that again: seinmastudios.com/posts/llm-be...
maybe in a US context, state governments could work? drafts.interfluidity.com/2025/03/09/v...
One thing the conservative movement has proven is that burying The New York Times in flak and pushback can be quite effective.
Where I am now it's already the 5th of July, but at home it is still July 4. I am posting what I traditionally post every July 4. Optimism of the will. "On the stairs I smoke a cigarette alone / Mexican kids are shootin' fireworks below." God bless America. www.youtube.com/watch?v=K_ty...
Your piece is great. As usual. Yeah, I think there's no getting around that we can't measure human welfare without a set of values to define what that means. There's not some "scientific" trick or shortcut that lets us authoritatively, universally, say that this is better than that. 1/
Technocrats pretended that GDP was that for a while, and it kind of worked because, broad brush, the correlation was pretty strong between GDP per capita and qualitative, intuitive, perceptions of prosperity and satisfaction. 2/
It was a great fit for "neoliberal" economics, whose ideological trick was largely to obscure methodological problems economists had long discussed and pretend something like economics 101 provided a scientific basis for policy to which intelligent people must defer. 3/
Besides the helpful casual empirics, it had this story: for market economies, GDP is the quantity of the highest-value (because market-optimized) basket of goods and services produced by the economy, and so that quantity, a simple number, should be a pretty good proxy for wealth. 4/
But real-life markets are imperfect optimizers, and even theoretical markets are arguably local rather than global optimizers that might get stuck in bad path dependencies (like automobile-dependent low density living, I'd argue). 5/
This trick of replacing an infinitely dimensioned "what" with a single number "how much" just doesn't work. 6/
So we are left with judgment calls to make, which invariably involve both evaluating tradeoffs on dimensions we'd mostly all agree are valuable (say shipbuilding capacity vs health care services), but also require imposing contested values. 7/
Some people think the costs of low-density single-family infrastructure are totally worth it, resources spent in support of human flourishing. I think an auto-centric built environment is both costly and inferior in welfare terms to achievable alternatives. 8/
Each claim requires assertions about other people's preferences in a normative sense, what they should want (since it's indeterminate what they do want while the hypothetical choices, two very different social equilibria, are not remotely before them). 9/
There's no "scientifically" right or wrong answer other ppl must defer to. Only judgments, not pulled from nowhere, informed by marshaling evidence, but judgments nonetheless, of which we have to persuade our fellow citizens, rather than truths we can discover and propound as incontrovertible. /fin
an unsurprisingly great summary, big picture, of how and where we have cornered ourselves economically the past few decades, by @sjshancoxli.liberalcurrents.com.
tech barons take note. this is the future you preferred to one that included musings on an unlikely unrealized capital gains tax. this is the future you may have delivered to all of us, but from which you will not be exempt. ht @alanbeattie.bsky.social
“Positive-sum solutions to multi-period stag hunts select for time consistency more than cost effectiveness.” @akhilrao.bsky.social akhilrao.org/blog/2025/07... remarkable insights into what leadership and coordination actually entail, in the dry language of game theory. except with death cults. 1/
project designs demanding “unnecessarily” expensive commitments, which work to ensure partners are highly committed to shared values, can augur success in ways that seem irrational, inefficient, dumb to reviewers who imagine a dictator with a Gantt chart can just get every participant to play their part. /fin
“The rise of Whatever” @eev.ee eev.ee/blog/2025/07... i’ve complicated, still mixed, views abt LLMs, whether what comes of them can be good despite much evident awfulness (“slop”). but the “whatever” thesis perfectly captures what happened to the web and crypto. and yeah, LLMs are whatever machines
i guess my lean towards (a) comes from a kind of qualitative empiricism, at best. year by year growth numbers aren’t reliable, but China’s share and increasing dominance in important sectors doesn’t require a well calibrated horserace to observe. 1/
that doesn’t mean China’s successes render its model superior in welfare terms! (there were lots of things the Soviet Union ill-advisedly produced a lot of, “dominated”). 2/
traditional growth measures are intended in a neoliberal context to obviate the question of “are we producing the right things?”: if markets optimize, then adding up dollars spent on new production yields scalars, the value of “best uses”, that we can rank. 3/
but i think rentierism in the US, persistent high mark-ups embedded in those numbers we sum, have weakened the historical correlation between GDP and welfare. 4/
so i think on both sides, our numbers aren’t what we want. they are overtly massaged on one side. they are undermined by structural change despite consistent, earnest, econometrics on the other. 5/
so ultimately we have little choice but to rely on coarse, qualitative observations, and impose our own weightings on them. 6/
does the military heft and/or generalized physical-world capacity that comes with shipbuilding overwhelm the cost in all the services thousands of workers in shipbuilding might otherwise supply? 7/
rule by law depends on it being a norm among those with the power to enforce laws and norms!
Thread by @sjshancoxli.liberalcurrents.com. (I lean towards (a), see drafts.interfluidity.com/2024/08/13/c... and the industrial policy pieces linked beneath it. but (b) could be right. it’s a continuing debate!) 👇
we are governed by laws or we aren’t. some small social media company should sue. will even this Supreme Court claim a President can not only refuse to enforce a law, but affirmatively impose a regime in defiance of it?
oh yes. where’s the unpleasant laxative that would drain our ugly mess?

