• 0 Posts
  • 44 Comments
Joined 3 years ago
Cake day: August 2nd, 2023

  • How can someone support them in good faith? I’ll focus on China, but here are some reasons:

    For starters, I don’t believe it’s possible to impose LGBTQ acceptance on a society from the outside. For example, making LGBTQ acceptance a precondition for having good relations with China has literally 0% chance of improving the lives of LGBTQ people there. It’s more likely to backfire. On the other hand, having good relations, and allowing cultural exchange to happen naturally, can - and I think, over the last few decades before relations soured, has - improved LGBTQ acceptance there.

    Also, amongst superpowers, China has a relatively good track record when it comes to using military force. They have had conflicts with neighboring countries, but it’s nothing compared to the wars the US or Russia (and the USSR) have fought.

    Finally (this one I don’t share, but I think it can be held in good faith), someone might not care about human rights all that much. For example, if you consider government-sponsored murders to be just the same as any other - not better, but also not worse - then even if you include Tiananmen Square and other killings by the government, the murder rate in China is still lower than in most of the world.

  • lily33@lemm.ee to Open Source@lemmy.ml · Proton's biased article on Deepseek
    edited · 1 year ago

    What makes these consumer-oriented models different is that rather than being trained on raw data, they are trained on synthetic data from pre-existing models. That’s what the “Qwen” or “Llama” parts mean in the name. The 7B model is trained on synthetic data produced by Qwen, so it is effectively a compressed version of Qwen. However, neither Qwen nor Llama can “reason”; they do not have an internal monologue.

    You got that backwards. They’re other models - qwen or llama - fine-tuned on synthetic data generated by Deepseek-R1. Specifically, reasoning data, so that they learn some of its reasoning ability.

    But the base model - and so the base capability there - is that of the corresponding qwen or llama model. Calling them “Deepseek-R1-something” doesn’t change what they fundamentally are, it’s just marketing.
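
    The distillation setup described above can be sketched roughly like this - a minimal, hypothetical example of packaging teacher-generated reasoning traces into supervised fine-tuning examples for a student base model (the function name and the `<think>` chat template here are illustrative assumptions, not DeepSeek’s actual pipeline):

    ```python
    # Sketch: format one (prompt, reasoning, answer) triple produced by a
    # teacher model (e.g. Deepseek-R1) into an input/target pair used to
    # fine-tune a student base model (e.g. a Qwen or Llama checkpoint).
    # The template below is a hypothetical example, not the real one.

    def build_sft_example(prompt: str, reasoning: str, answer: str) -> dict:
        # The target teaches the student to emit the reasoning trace
        # before the final answer, mimicking the teacher's style.
        target = f"<think>\n{reasoning}\n</think>\n{answer}"
        return {"input": prompt, "target": target}

    # One synthetic record, as a teacher might generate it:
    record = build_sft_example(
        prompt="What is 17 * 6?",
        reasoning="17 * 6 = 17 * 5 + 17 = 85 + 17 = 102.",
        answer="102",
    )
    ```

    The key point stands either way: fine-tuning on data like this nudges the base qwen/llama weights toward the teacher’s output style, but the underlying model - and its capability ceiling - is still the original base model.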