

A data center that will employ like 100 people and destroy their local air quality (probably), blast tons of light at night (definitely) and otherwise destroy the local ecology.




If a company adds a tariff charge to its bill when a consumer pays, or it can be shown that they passed along a charge by raising the price by the tariff amount or similar, then it is clear the consumer paid. The party paying the tariff is the consumer. Suing for a refund sets precedent, and it is how the current round of tariffs was deemed illegal in the first place.


I guarantee they’re already navigating an AI hellscape. The problem is not insurance workers or working-class wage workers; it is the system, which is designed in such a way that these folks do not get to facilitate actual care. Facilitating care would be good and right (let’s catch fraud and such, and also make sure we have efficient claims services even with single payer so that treatment is even more cost effective). Better that we have solidarity and convince the workers of these companies that they could still have jobs in the public sector, with equal pay and better benefits, under a single payer system.
There are vanishingly few people getting mega wealthy off of insurance in the US. It’s not the wage workers. It’s the wealthy class siphoning our money and stealing while we die from preventable diseases.


Traditional software is an artifact developed by humans, and it got better only to the degree that humans improved it for some task; improvement was never guaranteed. Windows 11 is proof of that, and there is a laundry list of regressions and bugs introduced into software developed by humans. I acknowledge you say usually, and especially for open source. I lukewarmly agree with that statement but disagree that large LLMs or other generative models will follow this trend, and merely want to point out that software development usually introduces bugs along the way, which are hopefully fixed by people who can reason over the code.
Which brings us to AI models, and really they should just be called transformer models; they are statistical tensor product machines. They are not software in a traditional sense. They are trained to match their training input in a statistical sense. If the input data is corrupted, the model will actually get worse over time, not better. If the data is biased, it will get worse over time, not better. With the amount of slop generated on the web, it is extraordinarily hard to denoise and decide what’s good data and what’s bad data that shouldn’t be used for training. Which means the scaling we’ve seen with increased data will not necessarily hold. And there’s not a clear indication that scaling the model size, which is largely already impractical, is having some synergistic or emergent effect as hoped and hyped.
Also, we’re really not in the infancy of AI. Maybe the infancy of widespread hype for it, but the idea of using tensor products for statistical learning algorithms goes back at least as far as Smolensky, maybe before, and that was what, 1990?
We are, I’d say, in the infancy of quantum-style compute, so we really don’t have much to draw on beyond theoretical models.
Generative LLMs have largely plateaued, in my opinion.
In my experience it is obvious. Calling people on it also makes them feel embarrassed usually. I put something like “I can just ask an LLM myself if I wanted this output. Please provide your own commentary.” If I were a manager and I had an employee just copy pasting that kind of output, I’d probably wonder if that employee actually contributes anything.
I think this is the way. Say “[coworker] wasn’t asked because they only respond with LLMs, so I just ask the LLMs directly. I am not sure what [coworker]’s expertise is anymore, I just don’t consult them” a certain number of times, and I suspect the coworker may in fact stop responding with LLMs.


This already happens intrinsically in the models. The tokens are abstracted in the internal layers and only translated in the output layer back to next token prediction. Training visual models is slightly different because you’re not outputting tokens but pixel values (or possibly bounding boxes or edges, but not usually; conversely if not generative you may be predicting labels which could theoretically be in token space).
The field itself is actually fairly stagnant in architecture. It’s still just attention layers all the way down. It’s just adding more context length and more layers and wider layers while training on more data. I personally think this approach will never achieve AGI or anything like it. It will get better at perfectly reciting its training data, but I don’t expect truly emergent phenomena to occur with these architectures just because they’re very big. They’ll be decent chatbots, but we already have that, and they’ll just consume ever more resources for vanishingly small improvements. They won’t functionally improve any true logical capability beyond regurgitating logical paths already trodden in their training data, and in a very brittle way, because they do not actually understand the logic or why the logic is valid; they have no true state model of the objects described in the token space they’re traversing probabilistically.


Sorry, I’m not saying that’s a good thing. It’s not just the context that’s expanding, but the parameter count of the base model. I’m saying at some point you just have saved a compressed version of the majority of the content (we’re already kind of there) and you’d be able to decompress it with even less loss. This doesn’t make it more useful for anything other than recreating copyrighted works.


Current models are speculated at 700 billion parameters plus. At 32-bit precision (single float; half float would be 16-bit), that’s 2.8TB of RAM per model, or about 10 of these units. There are ways to lower it, but if you’re trying to run full precision (say for training) you’d use over 2x this, maybe something like 4x depending on how you store gradients and updates, and running full precision I’d reckon means 32-bit. It’s possible, I suppose, that they train at 32-bit, but I’d be kind of surprised.
Edit: Also, they don’t release parameter counts anymore, but some folks think newer models are like 1.5 trillion parameters. So figure around 2-3x that number above for newer models. The only real strategy for these guys is bigger. I think it’s dumb, and the returns are diminishing rapidly, but you’ve got to sell the investors. If reciting nearly whole works verbatim is easy now, it’s going to be exact if they keep going. They’ll approach parameter counts that can just straight up save things into the weights.
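For anyone who wants to redo the arithmetic, here is a rough sketch in Python. The parameter counts are the speculated figures from this comment, not published numbers, and the 4x training overhead is only the rough guess mentioned above. The function names are made up for illustration, and terabytes are decimal (10^12 bytes).

```python
# Back-of-the-envelope memory estimate for holding model weights.
# Assumptions: parameter counts are speculation, the 4x training
# multiplier is a crude guess, and activations/KV cache are ignored.

BYTES_PER_TB = 10**12  # decimal terabytes


def weights_tb(num_params: int, bits_per_param: int) -> float:
    """Memory needed just to hold the weights, in TB."""
    return num_params * bits_per_param / 8 / BYTES_PER_TB


def training_tb(num_params: int, bits_per_param: int, overhead: float = 4.0) -> float:
    """Weights plus gradients and optimizer state, as a crude multiple."""
    return weights_tb(num_params, bits_per_param) * overhead


if __name__ == "__main__":
    for label, params in [("700B", 700 * 10**9), ("1.5T", 1500 * 10**9)]:
        for bits in (32, 16):
            print(
                f"{label} @ {bits}-bit: "
                f"{weights_tb(params, bits):.1f} TB weights, "
                f"~{training_tb(params, bits):.1f} TB to train"
            )
```

At 32-bit this gives 2.8TB for 700 billion parameters and 6TB for 1.5 trillion; dropping to 16-bit halves both.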


Thank you! This is really good info. I’ll take a look!


This is good to know. Can you provide a link to that court case or anything?


Valve states you can’t sell a Steam key on another platform for cheaper than on Steam, not that you can’t sell your game anywhere else at a lower price. That’s slightly different from the situation here. Not defending it, just saying that it is actually different.


Looking through, it seems like for the most part these are very niche and/or require the user to be using SSO or enterprise recovery options and/or to change, rotate, or resync keys often. I think few people using this for personal use would be interacting with that attack surface or accepting organizational invites, but it is serious for organizations (probably why they’re trying to address this quickly).
Honestly, I think a server in Bitwarden’s fleet being covertly controlled and undetected while also performing these attacks is unlikely. Certainly less likely than passwords being stolen from individual site hacks, or probably even banks. At that point it would just be easier to do these types of manipulations directly on bank accounts, crypto wallets, or email accounts than here. Then again, if you crack a wallet like this you theoretically get all the goodies to those too, I suppose, for a possibly short time (assuming the user wasn’t also using 2FA that isn’t email based).
Not to minimize these issues. They need to fix them; I’m just trying to ascertain how severe they are and whether individual users should have much cause for concern.


That’s where rent control and tax breaks come in, as well as scale if you’re opening multiple stores. I’m not aware of a large non profit grocery chain.


One thing to also note is that grocery margins are tight after they give shareholders and C-level staff hundreds of millions of dollars. In other words, if you’re not intent on making a profit and distributing that profit only to billionaires and friends, there’s plenty of space to subsidize costs even before accounting for tax breaks (which probably net even) and controlled rent factors.
People shouldn’t get wealthy off selling groceries while people go hungry.


I kinda hope some people join to sabotage from within


You can get by surprisingly well on 20B parameter models using a Mac with decent RAM, or even 8B parameter models that fit on most high-end (e.g. 16GB) graphics cards. It depends on your use cases, but I almost exclusively use smaller local models.


This is amazing. Very interested in picking one up


It looks and feels a bit like Windows with the theming it has out of the box. So it’s probably an easier on-ramp, and possibly recommended in “what Linux is most like Windows” Google searches and the like.
I think we’re in the “how long for Taiwan to develop a nuke” waiting room right now.