
The AI discussion thread

So, you assume that whatever a PhD in astrophysics says is true. No, that wasn't you, but I have to think that since physics is in chaos, astrophysics can't be a settled realm at this point. Astrophysics uses the laws of physics; since those are all in doubt, so is astrophysics.
Eh... macrophysics and microphysics are two different realms. We can calculate the age, size, density, energy, distribution, and makeup of the universe, galactic clusters, galaxies, star systems, and stellar bodies within a few percentage points. They all work within a realm of physics that is relatively simple and quite well understood.

Subatomic physics is another realm entirely, and is frankly voodoo bullshit.
 
Wired article, not in depth though:



DeepSeek had to come up with more efficient methods to train its models. “They optimized their model architecture using a battery of engineering tricks—custom communication schemes between chips, reducing the size of fields to save memory, and innovative use of the mix-of-models approach,” says Wendy Chang, a software engineer turned policy analyst at the Mercator Institute for China Studies. “Many of these approaches aren’t new ideas, but combining them successfully to produce a cutting-edge model is a remarkable feat.”

DeepSeek has also made significant progress on Multi-head Latent Attention (MLA) and Mixture-of-Experts, two technical designs that make DeepSeek models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek's latest model is so efficient that it required one-tenth the computing power of Meta's comparable Llama 3.1 model to train, according to the research institution Epoch AI.
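The Mixture-of-Experts idea mentioned above can be sketched in miniature: a router picks a few "experts" per token, so only a fraction of the total parameters is touched on each forward pass. Everything here (sizes, names, the router) is a toy illustration, not DeepSeek's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 16, 8, 2  # hidden size, expert count, experts used per token

# Each "expert" is a tiny linear layer; the router decides which ones run.
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)

def moe_layer(x):
    """Route token x to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(D)
y = moe_layer(x)
# Only TOP_K of N_EXPERTS weight matrices were used for this token: 2/8 of the
# FLOPs of a dense layer holding the same total parameter count.
```

That ratio is the whole trick: parameter count (capacity) grows with the number of experts while per-token compute stays roughly fixed, which is why MoE models are cheaper to train at a given quality.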
 
DeepSeek is interesting. If all their claims pan out, this could definitely kill OpenAI; the capital-intensive AI models of OpenAI are not sustainable, IMO. We'll have to see how OpenAI responds. DeepSeek is open source, so OpenAI could possibly use these innovations to improve its own models.
 
Very interesting, and it prompted the biggest single-day loss of market value ever for a company: Nvidia....



Even though China's DeepSeek is of course biased regarding Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and Western AI toward other matters
 
Even though China's DeepSeek is of course biased regarding Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and Western AI toward other matters
Such as?
 
Even though China's DeepSeek is of course biased regarding Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and Western AI toward other matters
But isn't that just the data it's trained on? If the model is open source, then anyone can use it to train on other data and build a less China-biased model?
 
I admit I know of those things only indirectly from reading Reddit, but googling "chatgpt censoring" gives enough results. The problem is it's evolving, and things it didn't censor before, it now does: breastfeeding, political topics, etc. The new Trump dynasty may even prompt changes to the responses allowed about him, his family, and his actions. I wouldn't put it past them, especially since he loves authoritarianism so much
 
I admit I know of those things only indirectly from reading Reddit, but googling "chatgpt censoring" gives enough results. The problem is it's evolving, and things it didn't censor before, it now does: breastfeeding, political topics, etc. The new Trump dynasty may even prompt changes to the responses allowed about him, his family, and his actions. I wouldn't put it past them, especially since he loves authoritarianism so much
Like J6
 
No current publicly accessible AI models (as far as I'm aware) have access to live data. They're all working off a dataset that's x months or years old. They might be able to tell you the current day, or maybe the weather if they have hooks for it, but ask what the most recent subvariant of COVID is and the answer will give you a rough idea of how old the training data is.
 
Are any of the AI video generators not complete shit? I totally understand it's about the prompts, but they all get multiple things wrong, such as:
Results that vary wildly based on the prompt
Needing multiple requests to get something moderately good
Concealing their pricing, e.g. how many images/videos can actually be made with 20 credits
Being constantly "busy" during the free trial
Needing apps to function, and those apps tend to be made by someone else, with review scores that swing quite a lot.
 
But isn't that just the data it's trained on? If the model is open source, then anyone can use it to train on other data and build a less China-biased model?
The model weights are open and free. The model architecture is open and free too, so open-source back ends can and have implemented it, meaning the weights can be run on anyone's hardware.

The training data is not free and open, though. You need trillions of tokens processed during training, on tens of thousands of GPUs, over periods of weeks to months.

You can fine-tune the publicly available weights to remove blind spots or censorship. This is done commonly enough, since some folks like to run AIs locally that are good at generating smut reading material tailored to the owner's fetishes 🙂
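The fine-tuning loop described here can be sketched in miniature: start from "released" weights, run a few gradient steps on a small new dataset, and the model's behavior shifts toward the new objective. Everything below is a toy stand-in (a 3-parameter logistic model, made-up data), not an actual LLM fine-tune.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for published model weights: a tiny linear "model".
weights = np.array([1.0, -2.0, 0.5])

def predict(x, w):
    return 1.0 / (1.0 + np.exp(-(x @ w)))  # sigmoid output in (0, 1)

# Small fine-tuning set encoding the behavior we want instead.
X = rng.standard_normal((32, 3))
y_target = (X[:, 0] > 0).astype(float)     # new objective: respond to feature 0

lr = 0.5
for _ in range(200):                        # plain gradient descent on log loss
    p = predict(X, weights)
    grad = X.T @ (p - y_target) / len(X)
    weights -= lr * grad

acc = np.mean((predict(X, weights) > 0.5) == y_target)
```

Real fine-tunes work the same way in principle, just with billions of parameters, which is why techniques like low-rank adapters exist: they update only a small add-on to the released weights instead of all of them.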
 

lol, lmao

this is part of the reason, isn't it?

https://stratechery.com/2025/deepseek-faq/

more detailed

Here’s the thing: a huge number of the innovations I explained above are about overcoming the lack of memory bandwidth implied in using H800s instead of H100s. Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA. DeepSeek engineers had to drop down to PTX, a low-level instruction set for Nvidia GPUs that is basically like assembly language. This is an insane level of optimization that only makes sense if you are using H800s.

Lastly, we emphasize again the economical training costs of DeepSeek-V3, summarized in Table 1, achieved through our optimized co-design of algorithms, frameworks, and hardware. During the pre-training stage, training DeepSeek-V3 on each trillion tokens requires only 180K H800 GPU hours, i.e., 3.7 days on our cluster with 2048 H800 GPUs. Consequently, our pre-training stage is completed in less than two months and costs 2664K GPU hours. Combined with 119K GPU hours for the context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.
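The arithmetic in that excerpt checks out, and it's quick to reproduce (the $2/GPU-hour rental price is the paper's stated assumption, not a market fact):

```python
# Reproducing the cost arithmetic from the DeepSeek-V3 excerpt above.
GPUS = 2048
PRICE_PER_GPU_HOUR = 2.00  # assumed H800 rental price, per the paper

pretrain_per_trillion_tokens = 180_000          # GPU hours per trillion tokens
days_per_trillion_tokens = pretrain_per_trillion_tokens / (GPUS * 24)

total_gpu_hours = 2_664_000 + 119_000 + 5_000   # pre-train + context ext. + post-train
total_cost = total_gpu_hours * PRICE_PER_GPU_HOUR

print(round(days_per_trillion_tokens, 1))       # 3.7  (days per trillion tokens)
print(total_gpu_hours)                          # 2788000 GPU hours, i.e. 2.788M
print(f"${total_cost / 1e6:.3f}M")              # $5.576M
```

Note the figure everyone quotes is just GPU-hours times an assumed rental rate; as the paper itself says, it excludes all the prior research, ablations, and hardware the company already owned.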
 