Introduction, or which AI I am talking about
First of all, I am interested in universal AI as a machine for achieving complex goals. That is, a hardware and software system that you can tell: "Build an airplane that costs $100, flies 1,000 kilometers at 800 km/h, and carries 5 people." Or: "Cure this particular person of terminal-stage cancer."
The AI should be able to handle such tasks if they are physically possible at all. If a task is impossible, it should achieve a result as close as possible to the one requested.
At the moment, I see two ways to get universal AI.
The first way is systems like reinforcement learning (hereinafter RL). Such a system is connected to the sensors and actuators of some robot, and it also receives a reward signal. It acts so as to collect as much reward as possible on average, and the reward channel is the primary way to tell the AI what we want from it.
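To make the RL setup concrete, here is a minimal sketch of the loop described above: the agent reads a sensor, picks an action, receives a reward, and updates itself so that its average reward goes up. Everything in it is an assumption for illustration only: the `TwoArmedBandit` world, the `EpsilonGreedyAgent` class, the payout probabilities, and the `run` helper are hypothetical stand-ins, not the kind of universal system this article is about.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy world: action 1 pays off more often than action 0."""

    def __init__(self, rng):
        self.rng = rng

    def observe(self):
        # A single dummy "sensor" reading; a real robot would return camera frames etc.
        return 0

    def act(self, action):
        # The "actuator" call; returns the reward signal for this step.
        payout_probability = 0.8 if action == 1 else 0.2
        return 1.0 if self.rng.random() < payout_probability else 0.0


class EpsilonGreedyAgent:
    """Keeps a running average reward per action and mostly picks the best one."""

    def __init__(self, n_actions, epsilon, rng):
        self.rng = rng
        self.epsilon = epsilon
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions

    def choose(self, observation):
        # With probability epsilon, explore a random action; otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental update of the average reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=5000, seed=0):
    rng = random.Random(seed)
    env = TwoArmedBandit(rng)
    agent = EpsilonGreedyAgent(n_actions=2, epsilon=0.1, rng=rng)
    total_reward = 0.0
    for _ in range(steps):
        observation = env.observe()
        action = agent.choose(observation)
        reward = env.act(action)
        agent.learn(action, reward)
        total_reward += reward
    return total_reward / steps, agent.values


if __name__ == "__main__":
    average_reward, estimates = run()
    print("average reward per step:", average_reward)
    print("estimated value of each action:", estimates)
```

Note that the only thing the designer controls here is the reward returned by `act`; the agent never sees the designer's intent directly, only that number.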
- , GPT-3, . . - , . , … GPT-3 “ - ” - . “ - ” - , . “ ?” GPT-3 , . GPT-3 .
Reinforcement Learning
.
- , RL , , , .
- . , .
,
, RL . - , 224224, , - . , , , , , . , - , , , - , , , . .
. :
1) , . , . ~1000-2000 . , , .
2) . - , “ ”, . , . , “ ”, , .
RL , .
RL . . RL - . - .
, RL . . RL , , .
?
-, RL . , , . - . .
- . RL , . , Doom, , , . RL , . RL - - , , . - - , , - , "" "".
, RL : Doom. .
RL , . , , - Exit.
RL , - , , , , , , Exit. , .
, , .
?
RL . . , .
, RL . , - - .
: RL , . , - , - . , RL - .
: , . , - . , RL , Exit. , “” - - , 5%, . , - , .
. , RL , . , , … : RL , , . , , , , , .
, RL , . . .
, , . - . : . -, . -, , , .
-. - Model-Based . “ ” - , , . ( , ) . , , -.
- , , ->, (, )-> .
.
. , . . RL , . , , , - . , RL , : , , .
, , , , , .
, , . , - RL.
? RL , , . , , . , - .
, . , , - , , . , - , - .
- - . , , .
- , . . , RL .
: ? , ?
: - , , … , .
: , , . , , , . , , , , 110 - , .
?
, : , , ( ). , “” “”.
, , . , , . “ , ”, - .
“ ” - , , . . , . , , . , , , . , -, “” , .
? ?
- . , -, , . , , RL . - , , - .
, . , . , RL “” - , , . ?
, . , - .
, , GAN. ( , - RL) , , . , . “” “” - , , - . .
, RL , - . , .
-
, , . , , , , - , .
, - RL - , . RL, . - - , RL.
RL . ( --) - : , , … , .
, . , , , .
, . , . . - “ ”, .
, , , , , . , , , - .
, RL . , . , RL - , , . , . .
, , , . , , -, . , . : 1000$ 100$ . ? , , . , . , . , - , - - . , , , . , , RL, , , , RL .
- , ( ) - , . , - , . RL - , . , - RL . , .
, , , . , .
, , .
I intended this article as a way to start a dialogue. I am surely wrong somewhere, and there are probably cleverer solutions than the ones I managed to think of. So detailed comments and interesting debates are welcome!