The following is an AI-generated summary and article based on a transcript of the video "NVIDIA CEO Jensen Huang Keynote at COMPUTEX 2024". Due to the limitations of AI, please verify the accuracy of the content for yourself.
00:05 | all right let's get |
---|---|
00:09 | started good take good |
00:12 | take oh |
00:15 | okay you guys ready yeah yeah |
00:17 | everybody thinks we make |
00:22 | GPUs NVIDIA is so much more than |
00:26 | [Music] |
00:28 | that this whole keynote you know is going |
00:30 | to be about that okay so we'll start at |
00:32 | the top examples of the use cases and |
00:34 | then seeing it in action that's kind of |
00:36 | the flow it's such a compelling story |
00:39 | I'm super nervous about this just got to |
00:41 | get in the Rhythm we're 2 weeks away |
00:43 | from it you guys are really really |
00:45 | making it great we should take a swing at |
00:47 | it yeah that's the plan we need to get |
00:49 | daily status on that animation can you |
00:51 | mute cuz I hear myself sorry what's the |
00:54 | drop date for all the videos it needs to |
00:56 | be done on the |
00:58 | 28th did you get all |
01:04 | that safe travels everybody super |
01:07 | excited to see everyone see you guys |
01:10 | soon okay bye |
01:13 | bye we're basically moving as fast as |
01:16 | the world can absorb technology so we've |
01:18 | got to Le problem ourselves |
01:21 | [Music] |
01:24 | isn't now the spine you just have to |
01:26 | figure out a way to make it pop you know |
01:28 | what I'm saying |
01:32 | yeah you know what I'm saying you want |
01:35 | to yeah that kind of |
01:39 | [Music] |
01:42 | [Applause] |
01:47 | [Music] |
01:53 | thing thank you I'm super late let's go |
02:02 | [Music] |
02:11 | please welcome to the stage NVIDIA |
02:14 | founder and CEO Jensen Huang |
02:18 | [Music] |
02:18 | [Applause] |
02:20 | [Music] |
02:35 | I am very happy to be |
02:38 | back thank you NTU for letting us use your |
02:42 | stadium the last time I was |
02:46 | here I received a degree from |
02:53 | [Applause] |
02:58 | NTU and I gave the run don't walk |
03:03 | speech and today we have a lot to cover |
03:07 | so I cannot walk I must |
03:10 | run we have a lot to cover I have many |
03:13 | things to tell you I'm very happy to be |
03:16 | here in |
03:17 | Taiwan Taiwan is the home of our |
03:21 | treasured |
03:23 | Partners this is in |
03:25 | fact where everything Nvidia does begins |
03:30 | our partners and ourselves take it to |
03:33 | the |
03:34 | world |
03:36 | Taiwan and our |
03:38 | partnership has created the world's AI |
03:44 | infrastructure |
03:46 | today I want to talk to you about |
03:48 | several |
03:49 | things |
03:53 | one what is happening and the meaning of |
03:57 | the work that we do together what what |
03:59 | is generative |
04:01 | AI what is its impact on our industry |
04:05 | and on every |
04:08 | industry a blueprint for how we will go |
04:12 | forward and engage this incredible |
04:18 | opportunity and what's coming |
04:22 | next generative Ai and its impact our |
04:26 | blueprint and what comes next |
04:30 | these are really really exciting |
04:32 | times a |
04:34 | restart of our computer industry an |
04:38 | industry |
04:40 | that you have forged an industry that |
04:43 | you have |
04:44 | created and now you're |
04:47 | prepared for the next major |
04:49 | Journey but before we |
04:53 | start Nvidia lives at the |
04:57 | intersection of computer graphics |
05:02 | simulations and artificial |
05:05 | intelligence this is our |
05:07 | soul everything that I show you |
05:11 | today is |
05:13 | simulation it's math it's science it's |
05:17 | computer science it's amazing computer |
05:21 | architecture none of it's |
05:23 | animated and it's all |
05:26 | homemade this is NVIDIA's soul and we |
05:30 | put it all into this virtual world we |
05:32 | call Omniverse please enjoy |
05:48 | [Music] |
07:35 | [Applause] |
07:54 | I want to speak to you in Chinese but I |
07:56 | have so much to tell you I have to think |
07:59 | too |
08:00 | hard to speak |
08:03 | Chinese so I have to speak to you in |
08:06 | English at the foundation of everything |
08:08 | that you saw were two fundamental |
08:12 | Technologies accelerated Computing and |
08:15 | artificial intelligence running inside |
08:18 | the |
08:21 | Omniverse those two technologies those |
08:25 | two fundamental forces of |
08:28 | computing are going to reshape the |
08:31 | computer |
08:32 | industry the computer |
08:35 | industry is now some 60 years |
08:38 | old in a lot of ways everything that we |
08:42 | do today was invented the year after my |
08:44 | birth in |
08:46 | 1964 the IBM System/360 introduced |
08:50 | central processing units general purpose |
08:53 | Computing the separation of hardware and |
08:57 | software through an operating system |
09:02 | multitasking IO subsystems |
09:05 | dma all kinds of technologies that we |
09:07 | use today architectural |
09:11 | compatibility backwards compatibility |
09:13 | family compatibility all of the things |
09:16 | that we know today about Computing |
09:18 | largely described in |
09:20 | 1964 of course the PC Revolution |
09:23 | democratized Computing and put it in the |
09:26 | hands and the houses of everybody and |
09:29 | then off in 2007 the iPhone introduced |
09:33 | mobile Computing and put the computer in |
09:35 | our |
09:36 | pocket ever since everything is |
09:38 | connected and running all the time |
09:41 | through the mobile |
09:42 | Cloud this last 60 years we saw |
09:47 | several just several not that many |
09:50 | actually two or three major technology |
09:55 | shifts two or |
09:57 | three tectonic shifts in Computing where |
10:02 | everything changed and we're about to |
10:04 | see that happen again there are two |
10:07 | fundamental things that are |
10:08 | happening the first is that the |
10:12 | processor the engine by which the |
10:14 | computer industry runs on the central |
10:16 | processing unit the performance scaling |
10:20 | has slowed tremendously and yet the |
10:24 | amount of computation we have to do is |
10:26 | still |
10:27 | doubling very quickly |
10:31 | exponentially if the processing requirement |
10:34 | if the data that we need to |
10:36 | process continues to scale exponentially |
10:38 | but performance does not we will |
10:42 | experience computation inflation and in |
10:45 | fact we're seeing that right now as we |
10:47 | speak the amount of data center power |
10:49 | that's used all over the world is |
10:51 | growing quite |
10:53 | substantially the cost of computing is |
10:55 | growing we are seeing computation |
10:57 | inflation |
11:00 | this of course cannot |
11:02 | continue the data is going to continue |
11:05 | to increase |
11:07 | exponentially and CPU performance |
11:10 | scaling will never return there is a |
11:12 | better way for almost two decades now |
11:15 | we've been working on accelerated |
11:17 | Computing Cuda augments a CPU offloads |
11:22 | and accelerates the work that A |
11:25 | specialized processor can do much much |
11:27 | better in fact the performance is so |
11:30 | extraordinary that it is very clear now |
11:33 | as CPU scaling has slowed and eventually |
11:36 | substantially stopped we should |
11:39 | accelerate everything I predict that |
11:42 | every application that is processing |
11:45 | intensive will be accelerated and surely |
11:48 | every data center will be accelerated in |
11:50 | the near future now accelerated |
11:53 | Computing is very sensible it's very |
11:55 | common sense if you take a look at an |
11:58 | application and here the 100T means |
12:01 | 100 units of time it could be 100 |
12:04 | seconds it could be 100 hours and in |
12:06 | many cases as you know we're now working |
12:08 | on artificial intelligence applications |
12:10 | that run for a 100 |
12:14 | days the 1T is code that requires |
12:19 | sequential processing where |
12:21 | single-threaded CPUs are really quite |
12:23 | essential operating systems control |
12:26 | logic really essential to have one |
12:29 | instruction executed after another |
12:31 | instruction however there are many |
12:34 | algorithms computer Graphics is one that |
12:36 | you can operate completely in parallel |
12:39 | computer Graphics image processing |
12:42 | physics simulations combinatorial |
12:45 | optimizations graph processing database |
12:48 | processing and of course the very famous |
12:51 | linear algebra of deep learning there |
12:54 | are many types of algorithms that are |
12:56 | very conducive to acceleration |
12:59 | through parallel processing so we |
13:02 | invented an architecture to do |
13:05 | that by adding the GPU to the |
13:09 | CPU the specialized processor can take |
13:13 | something that takes a great deal of |
13:14 | time and accelerate it down to something |
13:17 | that is incredibly fast and because |
13:20 | the two processors can work side by side |
13:23 | they're both autonomous and they're both |
13:24 | separate and independent that is we could |
13:27 | accelerate what used to take 100 units |
13:30 | of time down to one unit of time |
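To make the offload model described above concrete, here is a minimal sketch of moving a data-parallel workload from the CPU to the GPU. It uses CuPy purely as an illustrative stand-in for a CUDA-accelerated library, assumes a CUDA-capable GPU with CuPy installed, and the array sizes are arbitrary; it is not taken from the keynote itself.

```python
# Minimal sketch of CPU -> GPU offload for a data-parallel workload
# (assumes a CUDA-capable GPU and the CuPy package; sizes are illustrative).
import numpy as np
import cupy as cp

n = 4096
a_cpu = np.random.rand(n, n).astype(np.float32)
b_cpu = np.random.rand(n, n).astype(np.float32)

# Sequential/control-heavy work stays on the CPU; the dense, parallel part
# (here a large matrix multiply) is offloaded to the GPU.
a_gpu = cp.asarray(a_cpu)          # copy inputs into GPU memory
b_gpu = cp.asarray(b_cpu)
c_gpu = a_gpu @ b_gpu              # executes as a parallel CUDA kernel
c_cpu = cp.asnumpy(c_gpu)          # copy the result back when the CPU needs it

print(c_cpu.shape, c_cpu.dtype)
```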
13:33 | the speed up is incredible it almost |
13:36 | sounds |
13:38 | unbelievable it almost sounds |
13:40 | unbelievable but today I'll demonstrate |
13:42 | many examples for |
13:44 | you the benefit is quite extraordinary a |
13:47 | 100 times speed up but you only increase |
13:50 | the power by about a factor of three and |
13:54 | you increase the cost by only about |
13:57 | 50% we do this all the time in the PC |
14:00 | industry we add a GPU a $500 GeForce |
14:04 | GPU to a $1,000 PC and the performance |
14:07 | increases tremendously we do this in a |
14:10 | data center a billion-dollar data center |
14:14 | we add $500 million worth of GPUs and |
14:18 | all of a sudden it becomes an AI |
14:21 | Factory this is happening all over the |
14:24 | world today well the savings are quite |
14:28 | extraordinary you're getting 60 times |
14:32 | performance per dollar 100 times speed |
14:36 | up you only increase your power by 3x |
14:40 | 100 times speed up you only increase |
14:42 | your cost by 1.5x the savings are |
14:48 | incredible the savings are measured in dollars |
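As a quick arithmetic check of the figures quoted above, here is a back-of-the-envelope sketch using only the keynote's round numbers (100x speedup, 3x power, 1.5x cost); the variable names are illustrative.

```python
# Back-of-the-envelope check of the "CEO math" figures quoted above.
speedup = 100.0      # work per unit time vs. the CPU-only baseline
power_factor = 3.0   # total power rises ~3x
cost_factor = 1.5    # total system cost rises ~1.5x

perf_per_dollar = speedup / cost_factor      # ~67x, quoted as "60 times"
perf_per_watt = speedup / power_factor       # ~33x more work per unit of energy
cost_per_unit_work = cost_factor / speedup   # ~1.5% of the baseline cost
savings = 1 - cost_per_unit_work             # ~98.5%, in line with the 96-98% quoted later

print(f"perf/dollar: {perf_per_dollar:.0f}x, perf/watt: {perf_per_watt:.0f}x, "
      f"cost savings per unit of work: {savings:.1%}")
```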
14:51 | it is very clear that many many |
14:55 | companies spend hundreds of millions of |
14:57 | dollars processing data in the cloud |
15:00 | if it was accelerated it is not |
15:03 | unexpected that you could save hundreds |
15:06 | of millions of dollars now why is that |
15:10 | well the reason for that is very clear |
15:12 | we've been experiencing inflation for so |
15:15 | long in general purpose Computing now |
15:18 | that we finally |
15:21 | determined to accelerate there's an |
15:24 | enormous amount of captured loss that we |
15:27 | can now regain a great deal of |
15:31 | retained waste that we can now relieve |
15:34 | out of the system and that will |
15:36 | translate into savings Savings in money |
15:39 | Savings in energy and that's the reason |
15:41 | why you've heard me |
15:44 | say the more you buy the more you |
15:51 | [Applause] |
15:54 | save and now I've shown you the |
15:56 | mathematics |
15:59 | it is not accurate but it is |
16:01 | correct okay that's called CEO |
16:04 | math CEO math is not accurate but it is |
16:07 | correct the more you buy the more you |
16:09 | save well accelerated Computing does |
16:12 | deliver extraordinary results but it is |
16:15 | not easy why is it that it saves so much |
16:18 | money but people haven't done it for so |
16:20 | long the reason for that is because it's |
16:22 | incredibly |
16:23 | hard there is no such thing as a |
16:27 | software that you can just run through a |
16:28 | C compiler and all of a |
16:31 | sudden that application runs a 100 times |
16:34 | faster that is not even logical if it |
16:37 | was possible to do that they would have |
16:39 | just changed the CPU to do that you in |
16:42 | fact have to rewrite the software that's |
16:44 | the hard part the software has to be |
16:47 | completely rewritten so that you could |
16:50 | refactor and |
16:53 | re-express the algorithms that were |
16:56 | written for a CPU so that they could be |
16:58 | offloaded accelerated and |
17:00 | run in parallel that computer science |
17:05 | exercise is insanely hard well we've |
17:08 | made it easy for the world over the last |
17:10 | 20 years of course the very famous cuDNN |
17:13 | the Deep learning library that processes |
17:15 | neural networks we have a library for AI |
17:19 | physics that you could use for fluid |
17:21 | dynamics and many other applications |
17:23 | where the neural network has to obey the |
17:25 | laws of physics we have a great new |
17:28 | library called Aerial that is a CUDA |
17:31 | accelerated 5G radio so that we can |
17:34 | software Define and accelerate the |
17:37 | Telecommunications networks the way that |
17:39 | we've software-defined the world's |
17:43 | networking internet and so the |
17:46 | ability for us to accelerate that allows |
17:49 | us to turn all of Telecom into |
17:51 | essentially the same type of platform a |
17:54 | Computing platform just like we have in |
17:56 | the cloud cuLitho is a |
17:58 | computational lithography platform that |
18:01 | allows us to process the most |
18:04 | computationally intensive parts of chip |
18:07 | manufacturing making the masks TSMC is in |
18:10 | the process of going to production with |
18:12 | cuLitho saving enormous amounts of energy |
18:14 | and more enormous amounts of money but |
18:17 | the goal for tsmc is to accelerate their |
18:20 | stack so that they're prepared for even |
18:23 | further advances in algorithm and more |
18:26 | computation for deeper and deeper uh |
18:29 | narrower and narrower transistors |
18:31 | Parabricks is our gene sequencing library it |
18:34 | is the highest throughput library in the |
18:36 | world for gene sequencing cuOpt is an |
18:38 | incredible library for combinatorial |
18:41 | optimization route planning optimization |
18:45 | the traveling salesman problem |
18:47 | incredibly complicated |
18:50 | scientists have largely |
18:52 | concluded that you needed a quantum |
18:54 | computer to do that we created an |
18:56 | algorithm that runs on accelerated |
18:57 | Computing that runs Lightning Fast 23 |
19:00 | World Records we hold every single major |
19:03 | world record |
19:05 | today cuQuantum is an emulation system |
19:09 | for a quantum computer if you want to |
19:11 | design a quantum computer you need a |
19:12 | simulator to do so if you want to design |
19:14 | Quantum algorithms you need a Quantum |
19:16 | emulator to do so how would you do that |
19:19 | how would you design these quantum |
19:20 | computers create these Quantum |
19:22 | algorithms if the quantum computer |
19:24 | doesn't exist well you use the fastest |
19:27 | computer in the world that exists today |
19:29 | and we call it of course Nvidia Cuda and |
19:32 | on that we have an emulator that |
19:35 | simulates quantum computers it is used |
19:38 | by several hundred thousand researchers |
19:41 | around the world it is integrated into |
19:43 | all the leading frameworks for |
19:46 | quantum computing and is used in |
19:48 | scientific supercomputing centers |
19:50 | all over the world cuDF is an |
19:53 | unbelievable library for data |
19:56 | processing data processing consumes |
19:59 | the vast majority of cloud spend today |
20:02 | all of it should be accelerated cuDF |
20:05 | accelerates the major libraries used in |
20:07 | the world Spark many of you probably use |
20:10 | Spark in your companies |
20:14 | pandas a new one called Polars and of |
20:18 | course NetworkX which is a graph |
20:20 | processing database |
20:23 | library and so these are just some |
20:25 | examples there are so many more each one |
20:28 | of them had to be created so that we can |
20:31 | enable the ecosystem to take advantage |
20:33 | of accelerated Computing if we hadn't |
20:36 | created cuDNN CUDA alone wouldn't have |
20:39 | been possible |
20:41 | for all of the deep learning scientists |
20:43 | around the world to use because |
20:46 | CUDA and the algorithms that are used in |
20:49 | TensorFlow and PyTorch the deep learning |
20:51 | algorithms the separation is too far |
20:54 | apart it's almost like trying to do |
20:56 | computer graphics without OpenGL it's |
20:59 | almost like doing data processing |
21:00 | without |
21:02 | SQL these domain specific libraries are |
21:06 | really The Treasure of our company we |
21:08 | have 350 of |
21:11 | them these libraries are what it takes |
21:14 | and what has made it possible for us to |
21:16 | open so many markets I'll show |
21:19 | you some other examples today well just |
21:22 | last week Google announced that they've |
21:25 | put cuDF in the cloud and accelerated pandas |
21:29 | pandas is the most popular data science |
21:32 | library in the world many of you in here |
21:34 | probably already use pandas it's used by |
21:36 | 10 million data scientists in the world |
21:38 | downloaded 170 million times each |
21:42 | month it is the Excel that is the |
21:45 | spreadsheet of data scientists well with |
21:49 | just one click you can now use pandas in |
21:52 | Colab which is Google's cloud data |
21:55 | centers platform accelerated by cuDF the |
21:58 | speed up is really incredible let's take |
22:00 | a |
22:02 | [Music] |
22:13 | look that was a great demo right didn't |
22:15 | take |
22:19 | [Applause] |
22:23 | long when you accelerate data processing |
22:26 | that fast demos don't take long |
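For readers who want to try the pandas acceleration described above, here is a minimal sketch of the general usage pattern. It assumes a CUDA-capable GPU with the RAPIDS cuDF package installed; the parquet file and column names are made up purely for illustration and are not from the keynote.

```python
# Minimal sketch of cuDF's pandas accelerator mode (assumes RAPIDS cuDF is
# installed and a CUDA-capable GPU is available). In a notebook you would
# typically run the magic "%load_ext cudf.pandas" before importing pandas;
# from a script, the equivalent is "python -m cudf.pandas my_script.py".
import cudf.pandas
cudf.pandas.install()      # route pandas calls to the GPU, falling back to CPU

import pandas as pd        # unchanged pandas code from here on

# Hypothetical example data and column names, purely for illustration.
df = pd.read_parquet("transactions.parquet")
summary = (
    df.groupby("merchant_id")["amount"]
      .agg(["count", "mean", "sum"])
      .sort_values("sum", ascending=False)
)
print(summary.head(10))
```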
22:30 | okay well Cuda has now achieved what |
22:34 | people call a Tipping Point but it's |
22:36 | even better than that Cuda has now |
22:38 | achieved a virtuous cycle this rarely |
22:42 | happens if you look at history and all |
22:44 | the Computing architecture Computing |
22:47 | Platforms in the case of microprocessor |
22:50 | CPUs it has been here for 60 |
22:54 | years it has not been changed for 60 |
22:57 | years at this level this way of doing |
23:01 | Computing accelerated Computing has been |
23:04 | around |
23:05 | creating a new platform is extremely |
23:08 | hard because it's a chicken and egg |
23:10 | problem if there are no |
23:12 | developers that use your platform then |
23:15 | of course there will be no users but if |
23:17 | there are no users there is no installed |
23:20 | base if there is no installed base |
23:21 | developers aren't interested in it |
23:23 | developers want to write software for a |
23:25 | large installed base but a large |
23:27 | installed base requires a lot of |
23:29 | applications so that users would create |
23:32 | that install base this chicken or the |
23:35 | egg problem has rarely been broken and |
23:38 | has taken us now 20 years one domain |
23:41 | library after another one acceleration |
23:43 | Library another and now we have 5 |
23:45 | million developers around the |
23:48 | world we serve every single industry |
23:51 | from Health Care Financial Services of |
23:53 | course the computer industry automotive |
23:55 | industry just about every major industry |
23:57 | in the world just about every field of |
24:00 | science because there are so many |
24:03 | customers for our architecture OEM and |
24:06 | cloud service providers are interested |
24:08 | in building our |
24:09 | systems system makers amazing system |
24:12 | makers like the ones here in Taiwan are |
24:14 | interested in building our systems which |
24:16 | then takes and offers more systems to |
24:18 | the |
24:19 | market which of course creates greater |
24:22 | opportunity for us which allows us to |
24:24 | increase our scale R&D scale which |
24:27 | speeds up the application even more well |
24:30 | every single time we speed up the |
24:33 | application the cost of computing goes |
24:36 | down this is that slide I was showing |
24:38 | you earlier 100x speed up translates to |
24:42 | 96% 97% 98% savings and so when we go |
24:47 | from 100x speed up to 200x speed up to |
24:49 | a 1,000x speed up the savings the |
24:52 | marginal cost of computing continues to |
24:55 | fall well of course |
24:58 | we believe that by reducing the cost of |
25:02 | computing |
25:04 | incredibly the market developers |
25:07 | scientists inventors will continue to |
25:10 | discover new algorithms that consume |
25:12 | more and more and more Computing so that |
25:15 | one |
25:17 | day something |
25:19 | happens that a phase shift happens that |
25:22 | the marginal cost of computing is so low |
25:26 | that a new way of using computers |
25:29 | emerge in fact that's what we're seeing |
25:31 | now over the years we have driven down |
25:34 | the marginal cost of computing in the |
25:36 | last 10 years in one particular |
25:38 | algorithm by a million times well as a |
25:42 | result it is now very |
25:46 | logical and very common |
25:48 | sense to train large language models |
25:52 | with all of the data on the internet |
25:55 | nobody thinks |
25:56 | twice this idea that you could create a |
26:00 | computer that could process so much data |
26:04 | to write its own software the emergence |
26:07 | of artificial intelligence was made |
26:09 | possible because of this complete belief |
26:11 | that if we made Computing cheaper and |
26:13 | cheaper and cheaper somebody's going to |
26:15 | find a great use well today Cuda has |
26:19 | achieved the virtuous cycle the installed |
26:21 | base is growing computing cost is coming |
26:25 | down which causes more developers to |
26:27 | come up with more more |
26:29 | ideas which drives more |
26:32 | and now we're at the beginning |
26:34 | of something very very important but |
26:36 | before I show you that I want to show |
26:39 | you what is not possible if not for the |
26:42 | fact that we created CUDA that we |
26:45 | created the modern version of the |
26:48 | modern big bang of AI generative AI what |
26:52 | I'm about to show you would not be |
26:53 | possible this is Earth-2 the idea that |
26:58 | we would create a digital twin of the |
27:02 | earth that we would go and simulate the |
27:05 | Earth so that we could predict the |
27:08 | future of our planet to better |
27:12 | avert disasters or better understand the |
27:15 | impact of climate change so that we can |
27:17 | adapt better so that we could change our |
27:19 | habits now this digital twin of Earth is |
27:24 | probably one of the most ambitious |
27:26 | projects that the world's ever |
27:27 | undertaken and we're taking large |
27:30 | steps every single year and I'll show |
27:32 | you results every single year but this |
27:33 | year we made some great breakthroughs |
27:35 | let's take a |
27:45 | look on Monday the storm will Veer North |
27:48 | again and approach Taiwan there are big |
27:50 | uncertainties regarding its path |
27:53 | different paths will have different |
27:55 | levels of impact on |
27:57 | Taiwan |
28:15 | [Music] |
28:19 | (video narration, largely not captured in the transcript, presenting NVIDIA Earth-2 and the CorrDiff AI weather model) |
30:25 | [Applause] |
30:32 | someday in the near future we will have |
30:36 | continuous weather |
30:38 | prediction at every at every square |
30:41 | kilometer on the |
30:42 | planet you will always know what the |
30:45 | climate's going to be you will always |
30:47 | know and this will run continuously |
30:50 | because we've trained the |
30:52 | AI and the AI requires so little |
30:55 | energy and so this is just an incredible |
30:58 | achievement I hope you enjoyed it |
31:00 | and very |
31:01 | importantly |
31:14 | uh the truth is that was |
31:18 | a Jensen AI that was not |
31:23 | me I I wrote I wrote it but an AI Jensen |
31:27 | AI |
31:28 | had to say |
31:35 | it that is a miracle that is a miracle |
31:39 | indeed however in 2012 something very |
31:42 | important |
31:43 | happened because of our dedication to |
31:45 | advancing Cuda because of our dedication |
31:47 | to continuously improve the performance |
31:49 | of Drive the cost down |
31:52 | researchers discovered AI researchers |
31:54 | discovered Cuda in |
31:56 | 2012 that was |
31:58 | nvidia's first contact with |
32:02 | AI this was a very important day we had |
32:06 | the good |
32:07 | wisdom to work with the the scientists |
32:10 | to make it possible for a deep learning |
32:12 | to happen and AlexNet achieved of course |
32:15 | a tremendous computer vision |
32:17 | breakthrough but the great wisdom was to |
32:20 | take a step back and understand what |
32:22 | was the background what is the |
32:25 | foundation of deep learning what is this |
32:27 | long-term impact what is its potential |
32:31 | and we realized that this technology has |
32:34 | great potential to scale an algorithm |
32:36 | that was invented and discovered decades |
32:40 | ago all of a sudden because of more |
32:43 | data larger networks and very |
32:46 | importantly a lot more |
32:48 | compute all of a sudden deep learning |
32:51 | was able to achieve what no human |
32:54 | algorithm was able to now imagine if we |
32:58 | were to scale up the architecture even |
33:00 | more larger networks more data and more |
33:03 | compute what could be possible so we |
33:06 | dedicated ourselves to reinvent |
33:09 | everything after 2012 we changed the |
33:12 | architecture of our GPU to add Tensor |
33:14 | Cores we invented NVLink that was 10 |
33:19 | years ago |
33:20 | now |
33:22 | cuDNN TensorRT |
33:24 | NCCL we bought Mellanox TensorRT-LLM |
33:30 | the Triton inference server and all of |
33:33 | it came together on a brand new computer |
33:37 | nobody |
33:38 | understood nobody asked for it nobody |
33:41 | understood it and in fact I was certain |
33:44 | nobody wanted to buy it and so we |
33:46 | announced it at |
33:48 | GTC and OpenAI a small company in San |
33:51 | Francisco saw it and they asked me to |
33:54 | deliver one to them I delivered the |
33:57 | first DGX the world's first AI |
34:00 | supercomputer to OpenAI |
34:03 | in |
34:06 | 2016 well after that we continued to |
34:10 | scale from one AI supercomputer one AI |
34:14 | Appliance we scaled it up to large |
34:17 | supercomputers even larger by |
34:20 | 2017 the world discovered |
34:22 | Transformers so that we could train |
34:25 | enormous amounts of data and recognize |
34:28 | and learn patterns that are sequential |
34:30 | over large spans of time it is now |
34:34 | possible for us to train these large |
34:36 | language models to understand and |
34:38 | achieve a breakthrough in natural |
34:40 | language |
34:42 | understanding and we kept going after |
34:44 | that we built even larger ones and then |
34:47 | in November |
34:50 | 2022 trained on |
34:52 | thousands tens of thousands of Nvidia |
34:55 | gpus in a very large AI |
34:58 | supercomputer OpenAI announced ChatGPT |
35:02 | 1 million users after 5 |
35:05 | Days 1 million after 5 days a 100 |
35:09 | million after two months the fastest |
35:11 | growing application in history and the |
35:14 | reason for that is very simple it is |
35:17 | just so easy to use and it was so |
35:19 | magical to use to be able to interact |
35:22 | with a computer like it's |
35:24 | Human Instead of being clear about what |
35:28 | you want it's like the computer |
35:30 | understands your meaning it understands |
35:32 | your |
35:34 | intention oh I think here it asked about |
35:37 | the closest night market as |
35:39 | you know the night market is very |
35:42 | important to |
35:45 | me so when I was young uh I was I think |
35:49 | I was four and a half years old I used |
35:51 | to love going to the night market |
35:53 | because I I just love watching people |
35:55 | and so we went my parents |
35:58 | used to take us to the night market |
36:07 | and I love |
36:12 | going and one day you |
36:15 | guys might see that I have a large |
36:17 | scar on my face my face was cut because |
36:20 | somebody was washing their knife and I |
36:21 | was a little kid but my memories |
36:25 | of the Night Market is uh so deep |
36:28 | because of that and I used to love I I |
36:30 | just I still love going to the night |
36:31 | market and I just need to tell you guys |
36:33 | this the Tonghua Night Market is |
36:37 | really good because there's a lady uh |
36:40 | she's been working there for 43 |
36:43 | years she's the fruit lady and it's in |
36:46 | the middle of the street the middle |
36:48 | between the two go find her okay she |
37:00 | she's really |
37:02 | terrific I think it would be funny after |
37:04 | this all of you go to see |
37:06 | her she every year she's doing better |
37:09 | and better her cart has improved and |
37:12 | yeah I just love watching her succeed |
37:15 | anyways ChatGPT came along |
37:18 | and um and something is very important |
37:21 | in this |
37:22 | slide here let me show you something |
37:28 | this |
37:30 | slide okay and this |
37:34 | Slide the fundamental difference is |
37:38 | this until |
37:41 | ChatGPT revealed it to the |
37:44 | world AI was all about |
37:48 | perception natural language |
37:51 | understanding computer vision speech |
37:55 | recognition it's all about |
37:58 | perception and |
38:00 | detection this was the first time the |
38:03 | world saw a generative |
38:06 | AI It produced |
38:08 | tokens one token at a time and those |
38:12 | tokens were words some of the tokens of |
38:15 | course could now be images or charts or |
38:20 | tables songs words speech |
38:24 | videos those tokens could be anything |
38:27 | they anything that that you can learn |
38:29 | the meaning of it could be tokens of |
38:33 | chemicals tokens of proteins |
38:36 | genes you saw earlier in Earth 2 we were |
38:40 | generating |
38:42 | tokens of the |
38:44 | weather we can we can learn physics if |
38:47 | you can learn physics you could teach an |
38:49 | AI model physics the AI model could |
38:51 | learn the meaning of physics and it can |
38:54 | generate physics we were scaling down to |
38:57 | to 1 kilometer not by using filtering it was |
39:02 | generating and so we can use this method |
39:07 | to generate tokens for almost |
39:09 | anything almost anything of value we can |
39:13 | generate steering wheel control for a |
39:17 | car we can generate articulation for a |
39:20 | robotic |
39:22 | arm everything that we can learn we can |
39:26 | now generate |
39:28 | we have now arrived not at the AI era |
39:32 | but a generative AI era but what's |
39:34 | really important is |
39:38 | this this computer that started out as a |
39:42 | supercomputer has now evolved into a |
39:45 | Data Center and it |
39:48 | produces one thing it produces |
39:52 | tokens it's an AI |
39:55 | Factory this AI Factory |
39:58 | is generating creating producing |
40:01 | something of Great Value a new |
40:04 | commodity in the late |
40:06 | 1890s Nikola Tesla invented an AC |
40:11 | generator we invented an AI |
40:15 | generator the AC generator generated |
40:18 | electrons nvidia's AI generator |
40:21 | generates |
40:22 | tokens both of these things have large |
40:26 | Market opportunities |
40:28 | it's completely fungible in almost every |
40:31 | industry and that's why it's a new |
40:34 | Industrial |
40:35 | Revolution we have now a new Factory |
40:39 | producing a new commodity for every |
40:42 | industry that is of extraordinary value |
40:45 | and the methodology for doing this is |
40:47 | quite scalable and the methodology of |
40:49 | doing this is quite |
40:51 | repeatable notice how quickly so many |
40:54 | different AI models generative AI models |
40:57 | are being invented literally daily every |
41:00 | single industry is now piling |
41:03 | on for the very first |
41:06 | time the IT industry which is a |
41:10 | $3 trillion IT industry is |
41:14 | about to create something that can |
41:17 | directly serve a hundred trillion dollar |
41:20 | of Industry no longer just an instrument |
41:24 | for information storage or |
41:27 | data processing but a factory for |
41:31 | generating intelligence for every |
41:34 | industry this is going to be a |
41:36 | manufacturing industry not a |
41:39 | manufacturing industry of computers but |
41:42 | using the computers in |
41:44 | manufacturing this has never happened |
41:47 | before quite an extraordinary thing what |
41:50 | led started with accelerated Computing |
41:53 | led to AI led to generative Ai and now |
41:57 | an industrial |
41:59 | revolution now the impact to our |
42:02 | industry is also quite |
42:06 | significant of course we could create a |
42:08 | new commodity a new product we call |
42:11 | tokens for many Industries but the |
42:14 | impact to ours is also quite profound |
42:17 | for the very first time as I was saying |
42:19 | earlier in 60 years every single layer |
42:22 | of computing has been changed from CPUs |
42:25 | general purpose Computing to accelerated |
42:27 | GPU |
42:28 | Computing where the computer needs |
42:32 | instructions now computers process LLMs |
42:36 | large language models AI models and |
42:39 | whereas the Computing model of the past |
42:41 | is retrieval |
42:43 | based almost every time you touch your |
42:45 | phone some pre-recorded text or |
42:49 | pre-recorded image or pre-recorded video |
42:52 | is retrieved for you and recomposed |
42:56 | based on a recommender system to present |
42:58 | it to you based on your habits but in |
43:01 | the |
43:03 | future your computer will generate as |
43:05 | much as possible retrieve only what's |
43:09 | necessary and the reason for that is |
43:11 | because generated data |
43:13 | requires less energy than going to fetch |
43:16 | information generated data is also more |
43:18 | contextually relevant it will encode |
43:21 | knowledge it will encode its understanding of |
43:23 | you and instead of |
43:27 | get that information for me or |
43:29 | get that file for me you just |
43:31 | say ask me for an answer and instead of |
43:36 | a |
43:36 | tool instead of your computer being a |
43:39 | tool that we use the computer will now |
43:43 | generate |
43:44 | skills it performs tasks and instead of |
43:48 | an industry that is producing software |
43:51 | which was a revolutionary idea in |
43:53 | the early |
43:54 | 90s |
43:55 | remember the idea that Microsoft |
43:58 | created for packaging software |
44:01 | revolutionized the PC industry without |
44:03 | packaged software what would we use the |
44:05 | PC to |
44:07 | do it drove this industry and now we |
44:12 | have a new Factory a new computer and |
44:15 | what we will run on top of this is a new |
44:18 | type of software and we call it NIMs |
44:21 | NVIDIA inference |
44:24 | microservices now what happens is |
44:27 | the NIM runs inside this factory and |
44:30 | this NIM is a pre-trained model it's an AI |
44:35 | well this AI is of course quite complex |
44:39 | in itself but the computing stack |
44:42 | that runs AI is insanely complex when |
44:45 | you go and use ChatGPT underneath that |
44:48 | stack is a whole bunch of software |
44:51 | underneath that prompt is a ton of |
44:53 | software and it's incredibly complex |
44:56 | because the models are large |
44:57 | billions to trillions of parameters it |
45:00 | doesn't run on just one computer it runs |
45:01 | on multiple computers it has to |
45:04 | distribute the workload across multiple |
45:05 | GPUs tensor parallelism pipeline |
45:08 | parallelism data parallelism all kinds of |
45:12 | parallelism expert parallelism all kinds |
45:15 | of parallelism Distributing the workload |
45:17 | across multiple gpus processing it as |
45:20 | fast as possible because if you are in a |
45:23 | factory if you run a factory your |
45:26 | throughput directly correlates to your |
45:29 | revenues your throughput directly |
45:31 | correlates to quality of service and |
45:33 | your throughput directly correlates to |
45:35 | the number of people who can use your |
45:36 | service we are now in a world where data |
45:39 | center throughput |
45:41 | utilization is vitally important it was |
45:44 | important in the past but not vitally |
45:46 | important it was important in the past |
45:48 | but people don't measure it today every |
45:52 | parameter is measured start time uptime |
45:55 | utilization throughput idle time you |
45:58 | name it because it's a factory when |
46:01 | something is a factory its operations |
46:04 | directly correlate to the financial |
46:07 | performance of the company and so we |
46:10 | realize that this is incredibly complex |
46:12 | for most companies to do so what we did |
46:15 | was we created this AI in a box and the |
46:15 | was we created this AI in a box and the |
46:19 | container has incredible amounts of |
46:22 | software inside this container is CUDA |
46:26 | cuDNN TensorRT Triton for inference |
46:33 | could Auto scale in a kubernetes |
46:33 | could auto-scale in a Kubernetes |
46:35 | environment it has management services |
46:38 | and hooks so that you can monitor your |
46:39 | AIs it has common APIs standard APIs so |
46:47 | box you download this Nim and you can |
46:51 | talk to it so long as you have Cuda on |
46:54 | your |
46:55 | computer which is now of course |
46:57 | everywhere it's in every cloud available |
46:59 | from every computer maker it is |
47:01 | available in hundreds of millions of PCS |
47:04 | when you download this you have an AI |
47:07 | and you can chat with it like ChatGPT |
47:09 | all of the software is now integrated |
47:12 | 400 dependencies all integrated into one |
47:15 | we tested this NIM each one of these |
47:18 | pre-trained models against our |
47:21 | entire install base that's in the cloud |
47:24 | all the different versions of Pascal and |
47:26 | Ampere and Hopper |
47:31 | and all kinds of different versions I |
47:33 | even forget |
47:37 | some NIMs incredible invention this is |
47:41 | one of my favorites and of course as you |
47:45 | know we now have the ability to create |
47:48 | large language models and pre-trained |
47:49 | models of all kinds and we we have all |
47:52 | of these various versions whether it's |
47:55 | language based or vision based or |
47:56 | imaging based we have versions that |
47:59 | are available for healthcare digital |
48:02 | biology we have versions that are |
48:05 | digital humans that I'll talk to you |
48:07 | about and the way you use this just come |
48:09 | to ai.nvidia.com and today we just |
48:14 | posted up on Hugging Face the Llama 3 |
48:18 | NIM fully optimized it's available there |
48:21 | for you to try and you can even take it |
48:24 | with you it's available to you for free |
48:27 | and so you could run it in the cloud run |
48:29 | it in any cloud you could download this |
48:31 | container put it into your own data |
48:33 | center and you could host it and make it |
48:35 | available for your customers |
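To make the "download this NIM and chat with it" idea concrete, here is a minimal sketch of what calling a locally hosted NIM typically looks like. It assumes the NIM container has already been pulled and started and is exposing an OpenAI-compatible endpoint on localhost port 8000; the port, model identifier, and prompt are illustrative assumptions, not a verbatim recipe from the keynote.

```python
# Minimal sketch: chatting with a locally hosted Llama 3 NIM through its
# OpenAI-compatible API (assumes the NIM container is already running and
# listening on localhost:8000; the port and model name are assumptions).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the NIM's local endpoint (assumed)
    api_key="not-used-locally",           # placeholder; a local NIM may not need a key
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",      # illustrative model identifier
    messages=[
        {"role": "user", "content": "Summarize what an AI factory produces."}
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```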
48:38 | we have as I mentioned all kinds of different |
48:40 | domains physics some of it is for |
48:44 | semantic retrieval called RAGs vision |
48:47 | languages all kinds of different |
48:49 | languages and the way that you use |
48:52 | it is connecting these microservices |
48:56 | into large applications one of the most |
48:59 | important applications in the coming |
49:00 | future of course is customer service |
49:03 | agents customer service agents are |
49:05 | necessary in just about every single |
49:07 | industry it represents trillions of |
49:11 | dollars of of customer service around |
49:13 | the world nurses are customer service |
49:16 | agents in some ways some of them are |
49:19 | non-prescription or non-diagnostics |
49:23 | based nurses are essentially customer |
49:26 | service customer service for retail |
49:28 | for uh Quick Service Foods Financial |
49:31 | Services Insurance just tens and tens of |
49:34 | millions of customer service can now be |
49:37 | augmented by language models and |
49:41 | augmented by Ai and so these one these |
49:43 | boxes that you see are basically NIMs |
49:46 | some of the NIMs are reasoning agents |
49:49 | given a task figure out what the mission |
49:52 | is break it down into a plan some of the |
49:55 | NIMs retrieve information some of the |
49:57 | NIMs might go and do search |
50:01 | some of the NIMs might use a tool like |
50:04 | cuOpt that I was talking about earlier it |
50:07 | could use a tool that could be |
50:09 | running on SAP and so it has to learn a |
50:12 | particular language called ABAP maybe |
50:15 | some NIMs have to do SQL queries |
50:18 | and so all of these NIMs are experts |
50:22 | that are now assembled as a |
50:24 | team so what's happening |
50:28 | the application layer has been |
50:30 | changed what used to be applications |
50:33 | written with |
50:35 | instructions are now |
50:37 | applications that are assembling teams |
50:40 | assembling teams of |
50:42 | AIS very few people know how to write |
50:44 | programs almost everybody knows how to |
50:47 | break down a problem and assemble teams |
50:49 | every company I believe in the |
50:51 | future will have a large collection of |
50:55 | Nims and you would bring down the |
50:57 | experts that you want you connect them |
51:00 | into a team and you you don't even have |
51:04 | to figure out |
51:05 | exactly how to connect |
51:08 | them you just give the mission to an |
51:12 | agent to a Nim to figure out who to |
51:16 | break the task down and who to give it |
51:19 | to and they that a that Central the |
51:22 | leader of the of the application if you |
51:25 | will the leader of the team would break |
51:26 | down the task and give it to the various |
51:29 | team members the team members would do |
51:31 | their perform their task bring it back |
51:34 | to the team leader the team leader would |
51:36 | reason about that and present an |
51:37 | information back to you just like humans |
51:42 | this is in our near future this is the |
51:45 | way applications are going to look now |
51:47 | of |
51:48 | course we could interact with these |
51:51 | large these AI services with text |
51:54 | prompts and speech prompts |
51:57 | however there are many applications |
51:59 | where we would like to interact with |
52:01 | what what is otherwise a humanlike form |
52:04 | we call them digital humans Nvidia has |
52:07 | been working on digital human technology |
52:09 | for some time let me show it to you and |
52:12 | well before I do that hang on a second |
52:14 | before I do that okay digital humans have |
52:18 | the potential of being great |
52:21 | interactive agents with you they are |
52:23 | much more engaging they could be much |
52:25 | more empathetic |
52:27 | and of course um we have to uh cross |
52:33 | this incredible Chasm this uncanny Chasm |
52:37 | of realism so that the digital humans |
52:41 | would appear much more natural this is |
52:43 | of course our vision this is a vision of |
52:46 | where we love to go uh but let me show |
52:48 | you where we |
52:51 | are great to be in Taiwan before I head |
52:55 | out to the night market let's dive into |
52:56 | some exciting frontiers of digital |
52:59 | humans imagine a future where computers |
53:02 | interact with us just like humans can hi |
53:06 | my name is Sophie and I am a digital |
53:07 | human brand ambassador for UneeQ this |
53:11 | is the incredible reality of digital |
53:13 | humans digital humans will revolutionize |
53:16 | industries from customer service to |
53:20 | advertising and |
53:22 | gaming the possibilities for digital |
53:24 | humans are endless you the scans you |
53:27 | took of your current kitchen with your |
53:29 | phone they will be AI interior designers |
53:31 | helping generate beautiful |
53:33 | photorealistic suggestions and sourcing |
53:35 | the materials and Furniture we have |
53:37 | generated several design options for you |
53:39 | to choose from there'll also be AI |
53:42 | customer service agents making the |
53:44 | interaction more engaging and |
53:45 | personalized or digital healthcare |
53:48 | workers who will check on patients |
53:50 | providing timely personalized care I |
53:53 | did forget to mention to the doctor that |
53:55 | I am allergic to penicillin is it still okay |
53:57 | to take the medications the antibiotics |
54:00 | you've been prescribed ciprofloxacin and |
54:02 | metronidazole don't contain penicillin so |
54:05 | it's perfectly safe for you to take them |
54:08 | and they'll even be AI brand ambassadors |
54:10 | setting the next marketing and |
54:12 | advertising trends hi I'm Imma Japan's |
54:16 | first virtual |
54:17 | model new breakthroughs in generative Ai |
54:21 | and computer Graphics let digital humans |
54:24 | see understand and interact with us in |
54:27 | humanlike |
54:29 | ways H from what I can see it looks like |
54:33 | you're in some kind of recording or |
54:35 | production setup the foundation of |
54:37 | digital humans are AI models built on |
54:40 | multilingual speech recognition and |
54:43 | synthesis and llms that understand and |
54:46 | generate conversation |
54:57 | the AIS connect to another generative AI |
54:59 | to dynamically animate a lifelike 3D |
55:02 | mesh of a face and finally AI models |
55:07 | that reproduce lifelike appearances |
55:10 | enabling real-time path traced |
55:12 | subsurface scattering to simulate the |
55:14 | way light penetrates the skin scatters |
55:17 | and exits at various points giving skin |
55:20 | its soft and translucent |
55:22 | appearance Nvidia Ace is a suite of |
55:25 | digital human Technologies packaged as |
55:28 | easy to deploy fully optimized |
55:31 | microservices or Nims developers can |
55:34 | integrate Ace Nims into their existing |
55:36 | Frameworks engines and digital human |
55:39 | experiences Nemotron SLM and LLM NIMs to |
55:44 | understand our intent and orchestrate |
55:46 | other models Riva speech NIMs for |
55:49 | interactive speech and |
55:50 | translation audio to face and gesture |
55:53 | Nims for facial and body animation and |
55:56 | Omniverse RTX with DLSS for neural |
55:58 | rendering of skin and hair ACE NIMs run |
56:01 | on NVIDIA GDN a global network of NVIDIA |
56:05 | accelerated infrastructure that delivers |
56:08 | low latency digital human processing to |
56:10 | over 100 regions |
56:13 | [Music] |
56:23 | pretty incredible while |
56:27 | ACE runs in the cloud it also |
56:30 | runs on PCs we had the good wisdom of |
56:33 | including tensor core gpus in all of RTX |
56:37 | so we've been shipping AI gpus for some |
56:41 | time preparing ourselves for this day |
56:45 | the reason for that is very simple we |
56:46 | always knew that in order to create a |
56:48 | new Computing platform you need an |
56:50 | install base first eventually the |
56:52 | application will come if you don't |
56:55 | create the installed base |
56:56 | how could the application come and so if |
56:59 | you build it they might not |
57:01 | come but if you build it if you don't |
57:03 | build it they cannot come and so we |
57:06 | installed every single RTX GPU with |
57:09 | Tensor Core processing and |
57:12 | now we have 100 million GeForce RTX AI |
57:16 | PCs in the world and we're shipping 200 |
57:19 | and this COMPUTEX we're featuring |
57:21 | four new amazing |
57:24 | laptops all of them are able to run AI |
57:28 | your future laptop your future PC will |
57:31 | become an AI it'll be constantly helping |
57:33 | you assisting you in the background the |
57:37 | PC will also run applications that are |
57:40 | enhanced by AI of course all your photo |
57:43 | editing your writing and your tools all |
57:45 | the things that you use will all be |
57:47 | enhanced by Ai and your PC will also |
57:52 | host applications with digital humans |
57:56 | that are AIS and so there are different |
57:58 | ways that AIS will manifest themselves |
58:01 | and become used in PCS but PCS will |
58:04 | become very important AI platform and so |
58:06 | where do we go from |
58:09 | here I spoke earlier about the scaling |
58:13 | of our data centers and every single |
58:15 | time we scaled we found a new phase |
58:19 | change when we scaled from dgx into |
58:22 | large AI supercomputers we enabled |
58:26 | Transformers to be able to train on |
58:27 | enormously large data sets well |
58:31 | what happened was in the beginning the |
58:34 | data was human |
58:37 | supervised it required human labeling to |
58:40 | train AIS unfortunately there's only so |
58:43 | much you can human label Transformers |
58:47 | made it possible for unsupervised |
58:49 | learning to happen now Transformers just |
58:53 | look at an enormous amount of data or |
58:55 | look at enormous amount of video or look |
58:57 | at enormous amount of uh images and it |
59:00 | can learn from studying an enormous |
59:02 | amount of data find the patterns and |
59:04 | relationships |
59:05 | itself while the next generation of AI |
59:09 | needs to be physically based most of the |
59:12 | AIs today don't understand the laws |
59:14 | of physics they're not grounded in the |
59:17 | physical world in order for us to |
59:20 | generate uh uh images and videos and 3D |
59:25 | graphics and many physics phenomenons we |
59:29 | need AI that are physically based and |
59:33 | understand the laws of physics well the |
59:34 | way that you could do that is of course |
59:36 | learning from video is One Source |
59:38 | another way is synthetic data simulation |
59:40 | data and another way is using computers |
59:44 | to learn with each other this is really |
59:46 | no different than using AlphaGo having |
59:49 | AlphaGo play itself self-play and |
59:52 | between the two |
59:54 | capabilities same capabilities |
59:56 | playing each other for a very long |
59:58 | period of time they emerge even smarter |
01:00:02 | and so you're going to start to see this |
01:00:04 | type of AI emerging well if the AI data |
01:00:08 | is synthetically generated and using |
01:00:12 | reinforcement learning it stands to |
01:00:14 | reason that the rate of data generation |
01:00:16 | will continue to advance and every |
01:00:18 | single time data generation grows the |
01:00:21 | amount of computation that we have to |
01:00:23 | offer needs to grow with it we are about to |
01:00:26 | enter a phase where AIs can learn the |
01:00:28 | laws of physics and understand and be |
01:00:30 | grounded in physical world data and so |
01:00:32 | we expect that models will continue to |
01:00:34 | grow and we need larger gpus well |
01:00:37 | Blackwell was designed for this |
01:00:39 | generation this is Blackwell and it has |
01:00:42 | several very important Technologies uh |
01:00:45 | one of course is just the size of the |
01:00:47 | chip we took two of the largest chips a chip |
01:00:51 | that is as large as you can make it at |
01:00:53 | TSMC and we connected two of them |
01:00:56 | together with a 10 terabytes per second |
01:00:59 | link between the world's most advanced |
01:01:01 | SerDes connecting these two together we |
01:01:04 | then put two of them on a computer node |
01:01:07 | connected with a Grace CPU the Grace CPU |
01:01:10 | could be used for several things in the |
01:01:13 | training situation it could use it could |
01:01:16 | be used for fast checkpoint and restart |
01:01:18 | in the case of inference and generation |
01:01:21 | it could be used for storing context |
01:01:23 | memory so that the AI has memory and |
01:01:27 | understands uh the context of the the |
01:01:29 | conversation you would like to have it's |
01:01:30 | our second generation Transformer engine |
01:01:33 | Transformer engine allows us to adapt |
01:01:36 | dynamically to a lower precision based |
01:01:39 | on the Precision and the range necessary |
01:01:42 | for that layer of computation uh this is |
01:01:45 | our second generation GPU that has |
01:01:47 | secure AI so that you could |
01:01:50 | ask your service providers to protect |
01:01:52 | your AI from being either stolen |
01:01:56 | or tampered with this is our fifth |
01:01:59 | generation NVLink NVLink allows us to |
01:02:02 | connect multiple gpus together and I'll |
01:02:04 | show you more of that in a second and |
01:02:05 | this is also our first generation with a |
01:02:08 | reliability and availability engine this |
01:02:12 | system this RAS system allows us to test |
01:02:15 | every single transistor flip-flop memory |
01:02:18 | on chip memory off chip so that we |
01:02:22 | can in the field determine whether a |
01:02:25 | particular chip is failing the MTBF |
01:02:30 | the mean time between failure of a |
01:02:32 | supercomputer with 10,000 |
01:02:34 | gpus is a measured in |
01:02:37 | hours the mean time between failure of a |
01:02:41 | supercomputer with 100,000 gpus is |
01:02:44 | measured in |
01:02:45 | minutes and so the ability for a |
01:02:49 | supercomputer to run for a long period |
01:02:51 | of time and train a model that could |
01:02:53 | last for several months is practically |
01:02:56 | impossible if we don't invent |
01:02:58 | Technologies to enhance its reliability |
01:03:01 | reliability would of course enhance its |
01:03:03 | uptime which directly affects the cost |
01:03:06 | and then lastly decompression engine |
01:03:08 | data processing is one of the most |
01:03:09 | important things we have to do we added |
01:03:11 | a data compression engine decompression |
01:03:13 | engine so that we can pull data out of |
01:03:15 | storage 20 times faster than what's |
01:03:18 | possible today well all of this |
01:03:20 | represents Blackwell and I think we have |
01:03:22 | one here that's in production during GTC |
01:03:25 | I showed you Blackwell in a prototype |
01:03:28 | State um the other |
01:03:34 | side this is why we |
01:03:51 | practice ladies and gentlemen this is |
01:03:53 | Blackwell |
01:04:00 | Blackwell is in |
01:04:03 | production incredible amounts of |
01:04:08 | Technology this is our production board |
01:04:12 | this is the most complex highest |
01:04:14 | performance computer the world's ever |
01:04:18 | made this is the gray |
01:04:21 | CPU and these are you could see each one |
01:04:24 | of these Blackwell dies two of them |
01:04:26 | connected together you see that it is |
01:04:29 | the largest die the largest chip the |
01:04:32 | world makes and then we connect two of |
01:04:34 | them together with a 10 terabyte per |
01:04:36 | second |
01:04:39 | link okay and that makes the Blackwell |
01:04:42 | computer and the performance is |
01:04:45 | incredible take a look at |
01:04:48 | this so |
01:04:53 | you see our |
01:04:56 | computational flops the |
01:05:00 | AI flops for each generation have |
01:05:04 | increased by a thousand times in eight |
01:05:07 | years Moore's law in eight |
01:05:11 | years is something along the lines of oh |
01:05:15 | I don't know maybe 40 |
01:05:19 | 60 and in the last eight years Moore's law |
01:05:23 | has gone a lot lot less and so just to |
01:05:27 | compare even Mo's law at its best of |
01:05:31 | times compared to what Blackwell could |
01:05:34 | do so the amount of computations is |
01:05:36 | incredible and whenever we bring |
01:05:38 | the computation up the thing that |
01:05:41 | happens is the cost goes down and I'll |
01:05:44 | show you what we've done is we've |
01:05:46 | increased its computational |
01:05:48 | capability the energy used to train a |
01:05:54 | GPT-4 |
01:05:56 | 2 trillion parameters 8 trillion |
01:05:59 | tokens the amount of energy that is used |
01:06:03 | has gone down by 350 times Well Pascal |
01:06:08 | would have taken |
01:06:10 | 1,000 gigatt hours 1,000 gwatt hours |
01:06:14 | means that it would take a gigawatt data |
01:06:17 | center the world doesn't have a gigawatt |
01:06:19 | data center but if you had a gigawatt |
01:06:21 | data center it would take a month if you |
01:06:24 | had a 100 watt 100 megawatt data center |
01:06:26 | it would take about a year and so nobody |
01:06:30 | would of course um uh create such a |
01:06:34 | thing and um and that's the reason why |
01:06:36 | these large language models chat GPT |
01:06:38 | wasn't possible only eight years ago by |
01:06:41 | us driving down the increasing the |
01:06:43 | performance the energy efficent while |
01:06:45 | keeping and improving energy efficent |
01:06:48 | efficiency along the way we've now taken |
01:06:50 | with Blackwell what used to be 1,000 |
01:06:53 | gwatt hours to three and incredible |
01:06:56 | Advance uh 3 gwatt hours if it's a if |
01:07:01 | it's a um uh uh a 10,000 gpus for |
01:07:05 | example it would only take a cou 10,000 |
01:07:08 | gpus I guess it would take a few days 10 |
01:07:11 | days or so so the amount of advance in |
01:07:15 | just eight years is incredible well this |
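The time-to-train figures follow directly from dividing the quoted energy by the facility power. A small sketch using only the numbers from the keynote (1,000 GWh for the Pascal-era estimate, about 3 GWh for Blackwell); the 1.2 kW per GPU in the last line is an assumed figure used only for illustration.

```python
# Divide the quoted training energy by facility power to reproduce the
# month / year / ~10 days figures. The 1.2 kW per-GPU power is an assumption.
WH_PER_GWH = 1e9

def days_to_train(energy_gwh: float, power_watts: float) -> float:
    return energy_gwh * WH_PER_GWH / power_watts / 24

print(round(days_to_train(1000, 1e9)))         # 1,000 GWh at 1 GW     -> ~42 days ("a month")
print(round(days_to_train(1000, 100e6)))       # 1,000 GWh at 100 MW   -> ~417 days ("about a year")
print(round(days_to_train(3, 10_000 * 1200)))  # 3 GWh on 10,000 GPUs  -> ~10 days
```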
01:07:18 | Well, this is for inference, this is for token generation. Our token-generation performance has made it possible for us to drive the energy down by about 45,000 times.
01:07:31 | Around 17,000 joules per token, that was Pascal. 17,000 joules is kind of like two light bulbs running for two days; it would take that amount of energy, 200 watts running for two days, to generate one token of GPT-4.
01:07:53 | And it takes about three tokens to generate one word.
01:07:57 | So the amount of energy necessary for Pascal to generate GPT-4 tokens and give you a ChatGPT experience was practically impossible. But now we use only 0.4 joules per token, and we can generate tokens at incredible rates with very little energy.
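The per-token numbers line up the same way. A quick check using only the figures quoted above (roughly 17,000 J/token for Pascal, 0.4 J/token for Blackwell, and about three tokens per word):

```python
# Ratios from the per-token energy figures quoted on stage.
PASCAL_J_PER_TOKEN = 17_000.0
BLACKWELL_J_PER_TOKEN = 0.4
TOKENS_PER_WORD = 3

print(PASCAL_J_PER_TOKEN / BLACKWELL_J_PER_TOKEN)              # ~42,500x, the "~45,000x" quoted on stage
print(round(BLACKWELL_J_PER_TOKEN * TOKENS_PER_WORD, 2))       # ~1.2 J per generated word on Blackwell
print(round(PASCAL_J_PER_TOKEN * TOKENS_PER_WORD / 3.6e6, 4))  # ~0.014 kWh per word on Pascal-era hardware
```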
01:08:19 | Okay, so Blackwell is just an enormous leap. Even so, it's not big enough, and so we have to build even larger machines. The way we build them is called DGX: these Blackwell chips go into DGX systems.
01:08:43 | That's why we should practice.
01:08:49 | So this is a DGX Blackwell. This one is air-cooled and has eight of these GPUs inside. Look at the size of the heat sinks on these GPUs: about 15 kilowatts, 15,000 watts, and completely air-cooled.
01:09:10 | This version supports x86, and it goes into the infrastructure that we've been shipping Hoppers into.
01:09:18 | However, if you would like liquid cooling, we have a new system, and this new system is based on this board. We call it MGX, for modular.
01:09:35 | You won't be able to see this. Can they see this? Can you see this? You can? Are you okay?
01:09:53 | So this is the MGX system, and here are the two Blackwell boards. This one node has four Blackwell chips, and these four Blackwell chips are liquid-cooled.
01:10:11 | Nine of them, well, 72 of these GPUs, 72 of these GPUs are then connected together with a new NVLink. This is the NVLink switch, fifth generation.
01:10:28 | And the NVLink switch is a technology miracle. It's the most advanced switch the world has ever made; the data rate is insane. These switches connect every single one of these Blackwells to each other, so that we have one giant 72-GPU Blackwell.
01:10:50 | The benefit of this is that in one domain, one GPU domain, this now looks like one GPU. This one GPU has 72 versus the last generation's eight, so we increased the GPU count by nine times, we've increased the bandwidth by 18 times, we've increased the AI FLOPS by 45 times, and yet the amount of power is only 10 times higher: this is 100 kilowatts and that is 10 kilowatts. And that's for one rack.
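Taking the stage numbers at face value, the ratios work out as follows; this is just simple arithmetic on the quoted figures.

```python
# Simple ratios from the quoted figures for one NVLink domain:
# 72 GPUs vs. 8, 45x AI FLOPS, 100 kW vs. 10 kW.
gpu_ratio = 72 / 8        # 9x GPUs per domain
flops_ratio = 45          # quoted AI FLOPS increase
power_ratio = 100 / 10    # quoted power increase

print(gpu_ratio)                  # 9.0
print(flops_ratio / power_ratio)  # 4.5x more AI FLOPS per watt at the rack level
print(flops_ratio / gpu_ratio)    # 5.0x more AI FLOPS per GPU slot in the domain
```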
01:11:22 | Now, of course, you could always connect more of these together, and I'll show you how to do that in a second, but the miracle is this chip, this NVLink chip. People are starting to awaken to the importance of the NVLink chip, as it connects all of these different GPUs together, because the large language models are so large that they don't fit on just one GPU, and they don't fit on just one node. It's going to take an entire rack of GPUs, like this new DGX that I was just standing next to, to hold a large language model with tens of trillions of parameters.
01:11:53 | The NVLink switch is in itself a technology miracle: it's 50 billion transistors, 74 ports at 400 gigabits each, and a cross-sectional bandwidth of 7.2 terabytes per second. But one of the important things is that it has mathematics inside the switch, so that we can do reductions, which is really important in deep learning, right on the chip.
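To make "reductions in the switch" concrete: in data-parallel training, every GPU holds a partial result (for example, partial gradient sums) that has to be combined element-wise and handed back to everyone, an all-reduce. The sketch below shows the operation being computed; an in-network reduction performs this sum inside the switch fabric instead of shuttling full buffers between GPUs. This is a conceptual illustration, not NVIDIA's implementation.

```python
# Conceptual all-reduce: the element-wise sum every GPU needs.
# An in-network reduction computes this once, inside the switch fabric.
partial_results = [
    [0.1, 0.2, 0.3],   # GPU 0's partial gradients
    [0.4, 0.5, 0.6],   # GPU 1's partial gradients
    [0.7, 0.8, 0.9],   # GPU 2's partial gradients
]

def allreduce_sum(buffers):
    # Sum across GPUs for each element, then broadcast the result to all of them.
    return [round(sum(vals), 6) for vals in zip(*buffers)]

print(allreduce_sum(partial_results))  # [1.2, 1.5, 1.8]
```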
01:12:21 | And so this is what a DGX looks like now. A lot of people ask us, you know, there's this confusion about what Nvidia does: how is it possible that Nvidia became so big building GPUs?
01:12:41 | And so there's an impression that this is what a GPU looks like. Now, this is a GPU, one of the most advanced GPUs in the world, but this is a gamer GPU. You and I know that this is what a GPU looks like.
01:12:57 | This is one GPU, ladies and gentlemen: the DGX GPU.
01:13:06 | You know, the back of this GPU is the NVLink spine. The NVLink spine is 5,000 wires, two miles of them, and it's right here.
01:13:27 | This is an NVLink spine, and it connects 72 GPUs to each other.
01:13:36 | This is an electrical and mechanical miracle. The transceivers make it possible for us to drive the entire length in copper, and as a result this NVLink switch, driving the NVLink spine in copper, makes it possible for us to save 20 kilowatts in one rack. That 20 kilowatts can now be used for processing, just an incredible achievement.
01:14:06 | So this is the NVLink spine.
01:14:17 | [Applause]
01:14:23 | Whew. I went down today. And even this, even this is not big enough for AI factories, so we have to connect it all together with very high-speed networking.
01:14:37 | Well, we have two types of networking. We have InfiniBand, which has been used in supercomputing and AI factories all over the world and is growing incredibly fast for us. However, not every data center can handle InfiniBand, because they've already invested their ecosystem in Ethernet for too long, and it does take some specialty and some expertise to manage InfiniBand switches and InfiniBand networks.
01:15:04 | So what we've done is bring the capabilities of InfiniBand to the Ethernet architecture, which is incredibly hard. The reason for that is this: Ethernet was designed for high average throughput, because every single node, every single computer, is connected to a different person on the internet, and most of the communication is between the data center and somebody on the other side of the internet.
01:15:33 | However, in deep learning and in AI factories, the GPUs are not communicating with people on the internet; mostly they're communicating with each other.
01:15:45 | They're communicating with each other because they're all collecting partial products, and they have to reduce them and then redistribute them. Chunks of partial products, reduction, redistribution: that traffic is incredibly bursty.
01:16:00 | And it is not the average throughput that matters, it's the last arrival that matters, because if you're reducing, collecting partial products from everybody, if I'm trying to take all of your partial products, then it's not the average throughput that counts; it's whoever gives me the answer last.
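A tiny simulation makes the "last arrival" point concrete: when a collective step can only finish after the slowest contributor, the step time is the maximum latency, not the mean. The latencies below are synthetic, purely for illustration.

```python
# Synthetic illustration: one straggler sets the pace of the whole collective.
import random

random.seed(0)
latencies_ms = [random.uniform(1.0, 2.0) for _ in range(1023)] + [20.0]  # one slow flow

print(round(sum(latencies_ms) / len(latencies_ms), 2))  # mean ~1.5 ms, looks healthy
print(max(latencies_ms))                                # the step actually waits 20 ms
```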
01:16:29 | Okay, Ethernet has no provision for that, and so there are several things that we had to create. We created an end-to-end architecture so that the NIC and the switch can communicate, and we applied four different technologies to make this possible.
01:16:44 | Number one, Nvidia has the world's most advanced RDMA, and so now we have the ability to do network-level RDMA for Ethernet that is incredibly good.
01:16:54 | Number two, we have congestion control. The switch does telemetry at all times, incredibly fast, and whenever the GPUs or the NICs are sending too much information, we can tell them to back off so that it doesn't create hotspots.
01:17:12 | Number three, adaptive routing. Ethernet normally needs to transmit and receive in order. Instead, when we see congestion, or we see ports that are not currently being used, then irrespective of the ordering we will send packets to the available ports, and BlueField on the other end reorders them so that they come back in order. That adaptive routing is incredibly powerful.
01:17:40 | And then lastly, noise isolation. There's more than one model being trained, or something else happening, in the data center at all times, and their noise and their traffic can get into each other and cause jitter. When the noise of one model's training causes the last arrival to end up too late, it really slows down the training.
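Here is a minimal sketch of the adaptive-routing idea described above: packets are sprayed onto whichever ports have the least queued work, so they may arrive out of order, and the receiving side (BlueField, in the keynote's description) restores the original sequence before delivery. The data structures and packet sizes are illustrative, not the Spectrum-X wire format.

```python
# Illustrative sketch of spray-and-reorder adaptive routing; not the real protocol.
import heapq

def spray(packets, num_ports):
    """Send each packet to the least-loaded port, ignoring arrival order."""
    ports = [(0, p) for p in range(num_ports)]       # (queued bytes, port id)
    heapq.heapify(ports)
    routed = []
    for seq, size in packets:
        load, port = heapq.heappop(ports)
        routed.append((seq, port))
        heapq.heappush(ports, (load + size, port))
    return routed

def reorder(received):
    """Receiver restores sequence order before handing data to the application."""
    return [seq for seq, _port in sorted(received)]

packets = list(enumerate([1500, 9000, 1500, 9000, 1500]))  # (sequence number, bytes)
routed = spray(packets, num_ports=2)
print(routed)           # packets spread across ports by load, per-port order not preserved
print(reorder(routed))  # [0, 1, 2, 3, 4] once the receiver reorders them
```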
01:18:05 | Well, overall, remember: you have built a 5-billion-dollar, or 3-billion-dollar, data center, and you're using it for training. If the network utilization was 40% lower and, as a result, the training time was 20% longer, the $5 billion data center is effectively like a $6 billion data center. So the cost impact is quite high.
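The arithmetic behind that claim is a one-liner: if poor networking stretches the training time by 20%, the effective capital cost of the cluster scales by the same factor.

```python
# If training takes 20% longer, the same capex effectively buys 20% less work.
capex_usd = 5_000_000_000
slowdown = 1.20  # 20% longer wall-clock training time

print(capex_usd * slowdown)  # 6.0e9: "effectively a $6 billion data center"
```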
01:18:37 | Ethernet with Spectrum-X basically allows us to improve the performance so much that the network is essentially free, and so this is really quite an achievement. We have a whole pipeline of Ethernet products behind this. This is Spectrum-X800: it is 51.2 terabits per second with a 256 radix.
01:19:01 | The next one, coming one year from now, is a 512 radix, and that's called Spectrum X800 Ultra, and the one after that is X1600. But the important idea is this: the X800 is designed for tens of thousands of GPUs, the X800 Ultra is designed for hundreds of thousands of GPUs, and the X1600 is designed for millions of GPUs.
01:19:29 | The days of millions-of-GPU data centers are coming, and the reason for that is very simple. Of course we want to train much larger models, but very importantly, in the future almost every interaction you have with the internet or with a computer will likely have a generative AI running in the cloud somewhere.
01:19:52 | That generative AI is working with you, interacting with you, generating videos or images or text or maybe a digital human. So you're interacting with your computer almost all the time, and there's always a generative AI connected to it. Some of it is on prem, some of it is on your device, and a lot of it could be in the cloud.
01:20:15 | These generative AIs will also do a lot of reasoning: instead of just one-shot answers, they might iterate on answers so that the quality improves before they give them to you. So the amount of generation we're going to do in the future is going to be extraordinary. Let's take a look at all of this put together.
01:20:31 | Now, tonight, this is our first nighttime keynote. I want to thank all of you for coming out tonight at 7 o'clock.
01:20:51 | And so what I'm about to show you has a new vibe. Okay? There's a new vibe. This is kind of the nighttime-keynote vibe, so enjoy this.
01:21:04 | [Applause]
01:21:10 | [Music]
01:21:20 | (A Blackwell-themed video plays, set to music.)
01:22:43 | [Applause]
01:22:52 | Now, you can't do that at a morning keynote.
01:22:56 | [Applause]
01:22:59 | I think that style of keynote has never been done at COMPUTEX, ever. It might be the last. Only Nvidia can pull that off. Only I could do that.
01:23:25 | Blackwell, of course, is the first generation of Nvidia platforms launched right at the beginning, right as the world knows the generative AI era is here, just as the world realized the importance of AI factories, just at the beginning of this new industrial revolution.
01:23:44 | We have so much support: nearly every computer maker, every CSP, every GPU cloud, sovereign clouds, even telecommunication companies, enterprises all over the world. The amount of success, the amount of adoption, the amount of enthusiasm for Blackwell is just really off the charts, and I want to thank everybody for that.
01:24:15 | We're not stopping there. During this time of incredible growth, we want to make sure that we continue to enhance performance, continue to drive down cost, the cost of training and the cost of inference, and continue to scale out AI capability for every company to embrace. The further we drive performance up, the greater the cost decline.
01:24:39 | The Hopper platform, of course, was probably the most successful data center processor in history, and it is just an incredible success story. However, Blackwell is here, and every single platform, as you'll notice, consists of several things: you've got the CPU, you have the GPU, you have NVLink, you have the NIC, and you have the switch, the NVLink switch that connects all of the GPUs together into as large a domain as we can, and whatever we can connect, we connect with very large and very high-speed switches.
01:25:14 | Every single generation, as you'll see, is not just a GPU; it's an entire platform. We build the entire platform, we integrate the entire platform into an AI-factory supercomputer, and then we disaggregate it and offer it to the world.
01:25:29 | The reason for that is that all of you can create interesting and innovative configurations, all kinds of different styles, to fit different data centers, different customers, and different places: some of it for edge, some of it for telco. All of that different innovation is possible only if we make the systems open and make it possible for you to innovate. So we design it integrated, but we offer it to you disaggregated so that you can create modular systems. The Blackwell platform is here.
01:26:03 | Our company is on a one-year rhythm. Our basic philosophy is very simple: one, build the entire data-center scale, disaggregate it, and sell it to you in parts, on a one-year rhythm, and we push everything to the limits of technology.
01:26:21 | Whatever TSMC process technology there is, we will push it to the absolute limits; whatever packaging technology, push it to the absolute limits; whatever memory technology, push it to the absolute limits; SerDes technology, optics technology, everything is pushed to the limit.
01:26:36 | Well, and then, after that, we do everything in such a way that all of our software runs on this entire installed base. Software inertia is the single most important thing in computers: when a computer is backwards-compatible and architecturally compatible with all the software that has already been created, your ability to go to market is so much faster. And so the velocity is incredible when we can take advantage of the entire installed base of software that's already been created.
01:27:07 | Well, Blackwell is here, and next year is Blackwell Ultra. Just as we had H100 and H200, you'll probably see some pretty exciting new generations from us with Blackwell Ultra, again pushed to the limits, and the next-generation Spectrum switches I mentioned.
01:27:29 | Well, this is the very first time that this next click has been made, and I'm not sure yet whether I'm going to regret this or not.
01:27:45 | We have code names in our company, and we try to keep them very secret; oftentimes most of the employees don't even know them. But our next-generation platform is called Rubin, the Rubin platform.
01:28:01 | I'm not going to spend much time on it. I know what's going to happen: you're going to take pictures of it and go look at the fine print, and feel free to do that. So we have the Rubin platform, and one year later we'll have the Rubin Ultra platform.
01:28:14 | All of these chips that I'm showing you here are in full development, 100% of them, and the rhythm is one year, at the limits of technology, all of it 100% architecturally compatible. So this is basically what Nvidia is building, with all of the riches of software on top of it.
01:28:30 | So in a lot of ways, the last 12 years, from that moment of ImageNet, when we realized that the future of computing was going to radically change, to today, is really exactly what I was holding up earlier: GeForce pre-2012, and Nvidia today. The company has really transformed tremendously, and I want to thank all of our partners here for supporting us every step along the way.
01:28:57 | This is the Nvidia Blackwell platform.
01:29:10 | Let me talk about what's next. The next wave of AI is physical AI: AI that understands the laws of physics, AI that can work among us.
01:29:21 | And so they have to understand the world model, so that they understand how to interpret the world and how to perceive the world. They have to, of course, have excellent cognitive capabilities so they can understand us, understand what we ask, and perform the tasks.
01:29:45 | In the future, robotics is a much more pervasive idea. Of course, when I say robotics, there's humanoid robotics; that's usually the representation of it, but that's not at all the whole picture. Everything is going to be robotic. All of the factories will be robotic, the factories will orchestrate robots, and those robots will be building products that are robotic: robots interacting with robots, building products that are robotic.
01:30:19 | Well, in order for us to do that, we need to make some breakthroughs. Let me show you the video.
01:30:30 | The era of robotics has arrived. One day, everything that moves will be autonomous. Researchers and companies around the world are developing robots powered by physical AI.
01:30:45 | Physical AIs are models that can understand instructions and autonomously perform complex tasks in the real world.
01:30:55 | Multimodal LLMs are breakthroughs that enable robots to learn, perceive, and understand the world around them and plan how they'll act. And from human demonstrations, robots can now learn the skills required to interact with the world using gross and fine motor skills.
01:31:15 | One of the integral technologies for advancing robotics is reinforcement learning. Just as LLMs need RLHF, reinforcement learning from human feedback, to learn particular skills, generative physical AI can learn skills using reinforcement learning from physics feedback in a simulated world.
01:31:37 | These simulation environments are where robots learn to make decisions by performing actions in a virtual world that obeys the laws of physics. In these robot gyms, a robot can learn to perform complex and dynamic tasks safely and quickly, refining its skills through millions of acts of trial and error.
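As a minimal sketch of the trial-and-error loop the video describes: an agent proposes actions, a simulated "physics" scores the outcome, and the policy drifts toward actions the physics rewards. The toy task, the reward shape, and the update rule below are illustrative stand-ins for what a real simulator such as Isaac Sim provides; this is not NVIDIA's training pipeline.

```python
# Toy "reinforcement learning from physics feedback": the simulated physics,
# not a human, provides the reward signal. Everything here is illustrative.
import random

random.seed(1)

def physics_reward(force: float) -> float:
    # Pretend task: apply just enough force to move a block to a target.
    # The simulated physics penalizes overshoot and undershoot alike.
    target = 3.0
    return -abs(force - target)

policy_mean, step = 0.0, 0.5
for episode in range(200):
    action = policy_mean + random.gauss(0.0, 1.0)   # explore around the current policy
    if physics_reward(action) > physics_reward(policy_mean):
        policy_mean += step * (action - policy_mean)  # keep changes the physics scores higher

print(round(policy_mean, 2))  # ends up close to 3.0, the force the physics rewards
```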
01:31:58 | We built Nvidia Omniverse as the operating system where physical AIs can be created. Omniverse is a development platform for virtual-world simulation, combining real-time physically based rendering, physics simulation, and generative AI technologies.
01:32:21 | In Omniverse, robots learn how to be robots. They learn how to autonomously manipulate objects with precision, such as grasping and handling objects, or to navigate environments autonomously, finding optimal paths while avoiding obstacles and hazards. Learning in Omniverse minimizes the sim-to-real gap and maximizes the transfer of learned behavior.
01:32:49 | Building robots with generative physical AI requires three computers: Nvidia AI supercomputers to train the models; Nvidia Jetson Orin, and the next-generation Jetson Thor robotic supercomputer, to run the models; and Nvidia Omniverse, where robots can learn and refine their skills in simulated worlds.
01:33:13 | We build the platforms, acceleration libraries, and AI models needed by developers and companies, and we allow them to use any or all of the stacks that suit them best.
01:33:28 | The next wave of AI is here. Robotics, powered by physical AI, will revolutionize industries. This isn't the future; this is happening now.
01:33:50 | There are several ways that we're going to serve the market. First, we're going to create platforms for each type of robotic system: one for robotic factories and warehouses, one for robots that manipulate things, one for robots that move, and one for robots that are humanoid.
01:34:10 | Each one of these robotics platforms is, like almost everything else we do, a computer, acceleration libraries, and pretrained models. Computers, acceleration libraries, pretrained models; and we test everything, we train everything, and we integrate everything inside Omniverse, where, as the video was saying, robots learn how to be robots.
01:34:35 | Now, of course, the ecosystem of robotic warehouses is really, really complex. It takes a lot of companies, a lot of tools, and a lot of technology to build a modern warehouse, and warehouses are increasingly robotic; one of these days they will be fully robotic.
01:34:52 | And so, in each one of these ecosystems, we have SDKs and APIs that are connected into the software industry, SDKs and APIs connected into the edge AI industry and its companies, and then also, of course, systems designed for the PLC and robotic systems of the ODMs. It's then integrated by integrators and ultimately used to build warehouses for customers. Here we have an example of Kenmec building a robotic warehouse for Giant Group.
01:35:32 | Okay, and now let's talk about factories. Factories have a completely different ecosystem, and Foxconn is building some of the world's most advanced factories. Their ecosystem, again, includes edge computers and robotics software for designing the factories, designing the workflows, and programming the robots, and of course PLC computers that orchestrate the digital factories and the AI factories. We have SDKs that are connected into each one of these ecosystems as well.
01:36:03 | This is happening all over Taiwan. Foxconn is building digital twins of their factories. Delta is building digital twins of their factories; by the way, half is real and half is digital, half is Omniverse. Pegatron is building digital twins of their robotic factories. Wistron is building digital twins of their robotic factories.
01:36:33 | And this is really cool: this is a video of Foxconn's new factory. Let's take a look.
01:36:43 | Demand for Nvidia accelerated computing is skyrocketing as the world modernizes traditional data centers into generative AI factories. Foxconn, the world's largest electronics manufacturer, is gearing up to meet this demand by building robotic factories with Nvidia Omniverse and AI.
01:37:04 | Factory planners use Omniverse to integrate facility and equipment data from leading industry applications like Siemens Teamcenter X and Autodesk Revit. In the digital twin, they optimize floor layout and line configurations and locate optimal camera placements to monitor future operations with Nvidia Metropolis-powered vision AI.
01:37:29 | Virtual integration saves planners the enormous cost of physical change orders during construction. The Foxconn teams use the digital twin as the source of truth to communicate and validate accurate equipment layout.
01:37:47 | The Omniverse digital twin is also the robot gym where Foxconn developers train and test Nvidia Isaac AI applications for robotic perception and manipulation, and Metropolis AI applications for sensor fusion. In Omniverse, Foxconn simulates two robot AIs before deploying runtimes to Jetson computers.
01:38:11 | On the assembly line, they simulate Isaac Manipulator libraries and AI models for automated optical inspection, for object identification, defect detection, and trajectory planning. To transfer HGX systems to the test pods, they simulate Isaac Perceptor-powered Fobot AMRs as they perceive and move about their environment with 3D mapping and reconstruction.
01:38:37 | With Omniverse, Foxconn builds robotic factories that orchestrate robots running on Nvidia Isaac to build Nvidia AI supercomputers, which in turn train Foxconn robots.
01:39:00 | [Applause] |
01:39:06 | So a robotic factory is designed with three computers: you train the AI on Nvidia AI, you have the robots running, with PLC systems orchestrating the factories, and then you, of course, simulate everything inside Omniverse.
01:39:23 | Well, the robotic arm and the robotic AMRs work the same way, with three computer systems. The difference is that the two Omniverses will come together, so they'll share one virtual space. When they share one virtual space, that robotic arm gets placed inside the robotic factory.
01:39:42 | And again, three computers, and we provide the computer, the acceleration layers, and the pretrained AI models.
01:39:52 | We've connected Nvidia Manipulator and Nvidia Omniverse with Siemens, the world's leading industrial automation software and systems company. This is really a fantastic partnership, and they're working on factories all over the world. Siemens Simatic Pick AI now integrates Isaac Manipulator, and Simatic Pick AI runs and operates ABB, KUKA, Yaskawa, FANUC, Universal Robots, and Techman. So Siemens is a fantastic integration. We have all kinds of other integrations; let's take a look.
01:40:31 | ArcBest is integrating Isaac Perceptor into Vaux smart autonomy robots for enhanced object recognition and human motion tracking in material handling.
01:40:43 | BYD Electronics is integrating Isaac Manipulator and Perceptor into their AI robots to enhance manufacturing efficiencies for global customers.
01:40:54 | Idealworks is building Isaac Perceptor into their iw.os software for AI robots in factory logistics.
01:41:03 | Intrinsic, an Alphabet company, is adopting Isaac Manipulator into their Flowstate platform to advance robot grasping.
01:41:11 | Gideon is integrating Isaac Perceptor into Trey AI-powered forklifts to advance AI-enabled logistics.
01:41:21 | Argo Robotics is adopting Isaac Perceptor into their Perception Engine for advanced vision-based AMRs.
01:41:27 | Solomon is using Isaac Manipulator AI models in their AccuPick 3D software for industrial manipulation.
01:41:35 | Techman Robot is adopting Isaac Sim and Manipulator into TM Flow, accelerating automated optical inspection.
01:41:43 | Teradyne Robotics is integrating Isaac Manipulator into PolyScope X for cobots and Isaac Perceptor into MiR AMRs.
01:41:54 | Vention is integrating Isaac Manipulator into MachineLogic for AI manipulation robots.
01:42:04 | Robotics is here. Physical AI is here. This is not science fiction, and it's being used all over Taiwan. It's just really, really exciting.
01:42:13 | And that's the factory, and the robots inside, and of course all of the products are going to be robotic. There are two very high-volume robotics products. One, of course, is the self-driving car, or cars that have a great deal of autonomous capability. Nvidia again builds the entire stack. Next year we're going to go to production with the Mercedes fleet, and after that, in 2026, the JLR fleet. We offer the full stack to the world; however, you're welcome to take whichever parts, whichever layer, of our stack, just as the entire Drive stack is open.
01:42:52 | The next high-volume robotics product, one that's going to be manufactured by robotic factories with robots inside, will likely be humanoid robots. These have made great progress in recent years, in both their cognitive capability, because of foundation models, and the world-understanding capability that we're in the process of developing.
01:43:14 | I'm really excited about this area, because obviously the easiest robots to adapt into the world are humanoid robots, since we built the world for us. We also have the most data to train these robots compared with other types of robots, because we have the same physique, and so the amount of training data we can provide through demonstration capabilities and video capabilities is going to be really great. We're going to see a lot of progress in this area.
01:43:38 | Well, I think we have some robots that we'd like to welcome. Here we go. They're about my size.
01:43:59 | And we have some friends to join us. So the future of robotics is here, the next wave of AI.
01:44:08 | And of course, you know, Taiwan builds computers with keyboards, you build computers for your pocket, you build computers for data centers in the cloud. In the future, you're going to build computers that walk and computers that roll around.
01:44:29 | These are all just computers, and as it turns out, the technology is very similar to the technology for building all of the other computers that you already build today. So this is going to be a really extraordinary journey for us.
01:44:45 | Well, I want to thank you all. I've made one last video, if you don't mind, something that we really enjoyed making. Let's run it.
01:45:16 | [Music]
01:45:21 | (A closing video tribute to Taiwan plays, set to music.)
01:46:52 | [Applause]
01:47:16 | Thank you. I love you guys. Thank you.
01:47:23 | [Applause]
01:47:29 | Thank you all for coming. Have a great COMPUTEX. Thank you.