OpenAI shenanigans in the Cerebras IPO. They have a deal to pay a minimum of $10.53/m tokens (at 100% 24/7 capacity utilization), which is more like $40-$50/m in the real world. They charge $14/m.

humanspiral@lemmy.ca · 2 days ago

Fortunately, it is unlikely the US will wage war on China because it is already so far behind. The US is directly collapsing, and it explains its political desperation, but the war is entirely the oligarchy against the people, and protecting their pillage. Certainly massive spending for war is hard to stop, but the appetite to lose yet another war is not going to go up after Iran.

humanspiral@lemmy.ca · edit-2 2 days ago

I made a math mistake. The theoretical minimum cost to OpenAI is $3.15/m tokens ($3.30/m with electricity), as Cerebras has fixed context windows per user, and Codex Spark allows 3.33 concurrent users per node. That is still an optimistic cost of $16.50/m (at 20% of theoretical capacity) against $14/m of revenue.

I guess there is a market for very fast response tasks. OpenAI does have a routing system that charges a high cost per token, but gets most of the work done by their smaller/cheaper models behind the scenes.

But this turns out not to be ultra stupid if OpenAI has the internal training/improvement token workload to completely saturate the datacenter for its own use. Cerebras does have a training advantage over Nvidia. The immaturity of its software stack only matters for cutting-edge inference techniques.
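The cost arithmetic in this comment can be sketched as below. This is only a rough illustration, not OpenAI's or Cerebras's actual pricing model: it assumes the capacity payment is fixed, so cost per token scales inversely with utilization. The function name `effective_cost_per_m` and the constant names are made up for the example; the dollar figures are the ones quoted in the comment.

```python
def effective_cost_per_m(theoretical_cost: float, utilization: float) -> float:
    """Cost per million tokens when only a fraction of capacity is used.

    Assumption: the capacity payment is fixed, so the cost per token
    scales as 1 / utilization.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return theoretical_cost / utilization


THEORETICAL_WITH_ELECTRICITY = 3.30  # $/m tokens at 100% utilization (from the comment)
REVENUE_PER_M = 14.00                # $/m tokens charged (from the comment)

# Cost at the "optimistic" 20% utilization mentioned in the comment.
optimistic_cost = effective_cost_per_m(THEORETICAL_WITH_ELECTRICITY, 0.20)

# Utilization at which cost per token equals revenue per token.
break_even_utilization = THEORETICAL_WITH_ELECTRICITY / REVENUE_PER_M

print(f"cost at 20% utilization: ${optimistic_cost:.2f}/m")
print(f"break-even utilization: {break_even_utilization:.1%}")
```

Under this simple model, the deal only breaks even on inference revenue at somewhat above 23% utilization, which is why saturating the hardware with internal workloads changes the picture.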

humanspiral@lemmy.ca · 2 days ago

OpenAI shenanigans in the Cerebras IPO. They have a deal to pay a minimum of $10.53/m tokens (at 100% 24/7 capacity utilization), which is more like $40-$50/m in the real world. They charge $14/m.

humanspiral@lemmy.ca · 3 days ago

BitLocker was developed entirely inside MSFT. Upon further review, there is a chance that this is all somewhat normal behaviour: it is part of MSFT's safeOS, meant to make it convenient to recover BitLocker access and update Windows.

humanspiral@lemmy.ca · 3 days ago

Does BitLocker encrypt the whole volume, or just user data folders? It’s a performance issue to encrypt anything that doesn’t need to be.

humanspiral@lemmy.ca · 3 days ago

100% certainty of a backdoor. Is BitLocker developed outside of MSFT? It would seem to need MSFT cooperation to implement.

humanspiral@lemmy.ca · 4 days ago

9 GW, if run 24/7 (capacity utilization is actually low on average in the US), is 551.88 TWh/year. 1500x. Natural gas is not that much cleaner than coal from a CO2/GHG warming perspective.
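For reference, the power-to-annual-energy conversion can be checked with a few lines. This is a minimal sketch assuming a constant load and an 8,760-hour year; the helper names `annual_twh` and `implied_gw` are hypothetical. Under these assumptions, 9 GW running continuously comes out to about 78.84 TWh/year, while 551.88 TWh/year would correspond to roughly 63 GW of continuous load, so the quoted figure may rest on a different input.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_twh(power_gw: float, utilization: float = 1.0) -> float:
    """Annual energy in TWh for a load of power_gw at the given utilization."""
    return power_gw * HOURS_PER_YEAR * utilization / 1000.0


def implied_gw(twh_per_year: float) -> float:
    """Continuous power in GW that would produce twh_per_year."""
    return twh_per_year * 1000.0 / HOURS_PER_YEAR


print(f"9 GW, 24/7: {annual_twh(9):.2f} TWh/year")
print(f"551.88 TWh/year implies {implied_gw(551.88):.1f} GW continuous")
```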

humanspiral@lemmy.ca · 6 days ago

It's 9 GW of consumption, and 19 GW of total heat generation.

humanspiral@lemmy.ca · 6 days ago

17 GW of heat is both an underestimate and an overestimate.

3,600 of those industrial-scale generators to power Stratos

Caterpillar 2.5 MW generators have a maximum efficiency of 45%, so 19 GW is the peak thermal power. That is roughly 26 Hiroshimas per day.

It’s an overestimate because datacenter CPU/GPU capacity utilization is on average under 10%. The plant could still generate all that power for export, trapping all that heat in the valley.
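The heat estimate in this comment can be reproduced roughly as follows. This is a back-of-the-envelope sketch, not a thermal model: it assumes all fuel energy ends up as heat on site, takes the Hiroshima bomb at about 15 kilotons of TNT (a commonly cited figure), and uses 4.184e12 J per kiloton. All constant and function names are made up for the example. Note that 9 GW divided by 45% efficiency gives 20 GW, slightly above the 19 GW used in the comment.

```python
GENERATOR_COUNT = 3_600          # from the quoted article
GENERATOR_MW = 2.5               # Caterpillar unit size, from the comment
PEAK_EFFICIENCY = 0.45           # maximum efficiency, from the comment

JOULES_PER_KILOTON_TNT = 4.184e12
HIROSHIMA_JOULES = 15 * JOULES_PER_KILOTON_TNT  # assumption: ~15 kt yield
SECONDS_PER_DAY = 24 * 60 * 60

# Electrical output of all generators, in GW.
electrical_gw = GENERATOR_COUNT * GENERATOR_MW / 1000

# Fuel energy burned per second to produce that output, in GW thermal.
fuel_thermal_gw = electrical_gw / PEAK_EFFICIENCY


def hiroshimas_per_day(thermal_gw: float) -> float:
    """Daily heat release expressed in Hiroshima-bomb energy equivalents."""
    joules_per_day = thermal_gw * 1e9 * SECONDS_PER_DAY
    return joules_per_day / HIROSHIMA_JOULES


print(f"electrical: {electrical_gw:.1f} GW, fuel input: {fuel_thermal_gw:.1f} GW")
print(f"19 GW thermal: {hiroshimas_per_day(19):.1f} Hiroshimas/day")
print(f"at 10% utilization: {hiroshimas_per_day(19 * 0.10):.1f} Hiroshimas/day")
```

The last line illustrates the overestimate point: at the sub-10% utilization mentioned above, the daily heat release would be about a tenth of the peak figure.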

humanspiral@lemmy.ca · 8 days ago

The reason you can’t buy RAM anymore is that “projections” of 16 GW+ of AI deployment in the US this year would require 70% of RAM to go to AI. 5 GW is a practical ceiling for projects currently in active development. Not only is NVIDIA growing its undelivered inventory at huge rates ($30B latest), its customers have $150B in “Construction in process” inventory, as they aren’t getting the transformers and utility hookups needed to finish and power on their datacenters. The circular financing by NVIDIA is just forcing its customers to shift unused GPU inventory into their own warehouses. It eventually leads to fewer new sales and less manufacturing of its GPUs, and then, hopefully, RAM price normalization.

humanspiral@lemmy.ca · 8 days ago

US (enterprise/Kubernetes) datacenters also average 5% GPU and 8% CPU utilization. Much of the datacenter buildout is for “reserving” capacity, even if existing infrastructure could accommodate “spot rental” or “serverless” (let Google/AWS cram your work request into the machine of their choice) to get 6x+ more “tokens”.

humanspiral@lemmy.ca · 8 days ago

Google/Amazon/MSFT circular investments in the big LLMs are all entirely funded in compute credits. Even if they eventually escape profit velocity, it bleeds equity to the mega tech oligarchy, while padding their revenue/stock price growth.

humanspiral@lemmy.ca · 8 days ago

The key hurdle facing the AI frenzy/bubble this year: not many datacenters are actually being completed.

humanspiral@lemmy.ca · 9 days ago

decent performance on 6gb gpu without quantization: https://www.youtube.com/watch?v=8F_5pdcD3HY&t=9s

humanspiral@lemmy.ca · edit-2 9 days ago

Qwen 3.6 is awesome, but 48-64 GB is still real money these days (though 32 GB on a dedicated separate machine is also more money). It reaches Sonnet 3.5 to Opus 4.5 level benchmarks. The online cost metrics for 27B and 35B are way off considering the overall usefulness of a 48-64 GB machine (inclusive of GPU VRAM for 35B), which even in single, non-batching use could displace $5-$7/day of use.

Local costs are much lower than the online costs in the linked chart, but if you are going online, there are better models.

humanspiral@lemmy.ca · 9 days ago

Literally the most absurd offer in the history of the universe. You pay $150/month to have some extra noise and space taken away from you. While they install batteries in your home, they are for their use, and take up more space. All you get is a phone app that can turn off lights in other rooms.

humanspiral@lemmy.ca · 10 days ago

Their production rate is still 4 per month. They did promise large production capacity this year.

humanspiral@lemmy.ca · 11 days ago

This is new, and somehow related to onshore wind farms. Courts have ruled that the DoD's radar interference claims about offshore wind farms (previously OK’d by the DoD as not creating interference) were BS, essentially allowing them to resume. This would be even more BS, with no possible chance of passing court review, because the DoD can’t make the radar BS claim.

humanspiral@lemmy.ca · 11 days ago

Can’t you just vibe code your banking app? As long as I can make my balance higher, this will beat iphone rounded corners banking app.

humanspiral@lemmy.ca · 11 days ago

so paywalled, didn’t even show a byline.

This is the fundamental mechanism behind the subprime crisis: channels to offload debt so that more debt can be issued.

humanspiral@lemmy.ca · 13 days ago

A lot of companies building AI datacenters are laying off people to afford the gamble, with Oracle and Meta the biggest ones. So there’s a real effect of burning core bridges to get into the bubble gold rush.

humanspiral@lemmy.ca · 19 days ago

what does github copilot do? ELI20ish

humanspiral@lemmy.ca · 2 months ago

Someone tell Trump to walk down 5th Avenue, like Iran's President just did.

humanspiral@lemmy.ca · 3 months ago

Nvidia results show path towards AI bubble pop. $45B+ increase ($95B total) in supply commitments at all time high RAM prices

humanspiral@lemmy.ca · edit-2 4 months ago

Anthropic CEO important, but evil, essay: The adolescence of technology.

humanspiral@lemmy.ca · 6 months ago

TIL: There is an open source "Alexa replacement" project

humanspiral@lemmy.ca · 6 months ago

More circular AI financing. Anthropic promises $30B spending on Nvidia and MSFT, while the 2 companies invest $15B at over double actual share value in Anthropic.

humanspiral@lemmy.ca · 7 months ago

Looking more like insider trading scam than AI bubble: Open AI raises commitments/partnerships to 36GW with new 10gw Broadcom deal

humanspiral@lemmy.ca · 1 year ago

The tariff plan -- Daily Show

humanspiral@lemmy.ca · 2 years ago

Trump will plunge U.S. somewhere 'between recession and depression': MSNBC analyst