Bottlenecks for AI
AI applications: automation everywhere
Any activity can potentially be automated, provided there is enough data to train models.
The automation equation: Automation probability = Coherence × Data
- Coherence: similarity between cases; all configurations are known. We can improve coherence by controlling the environment (e.g. battery hens, factories, autonomous driving on private property, prefabrication in construction)
- Data: the less coherent the environment, the more data we need to describe it (e.g. driving on private property in Florida vs. driving across the US vs. driving in Toulouse).
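The automation equation above can be sketched as a toy function. This is purely illustrative (the function name, the [0, 1] scales, and the example values are assumptions, not part of the source): it only expresses that automation probability grows with both coherence and data.

```python
def automation_probability(coherence: float, data: float) -> float:
    """Toy model of the automation equation: probability = coherence * data.

    coherence: similarity between cases in the environment, in [0, 1]
    data: availability of data describing the environment, in [0, 1]
    (Both scales are assumptions made for illustration.)
    """
    if not (0.0 <= coherence <= 1.0 and 0.0 <= data <= 1.0):
        raise ValueError("coherence and data must be in [0, 1]")
    return coherence * data

# A controlled environment (e.g. a factory) with abundant data scores high;
# an open environment (e.g. city driving) scores lower.
print(automation_probability(1.0, 1.0))   # controlled + well-documented
print(automation_probability(0.5, 0.5))   # open + partially documented
```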
Source
The working paper "The Future of Employment" & Q. Chenevier's own opinions
Perception & manipulation
Two unsolved challenges:
- Perception in unstructured environments: houses, fields
- Handling new or irregular objects, with a soft grasp and by learning from mistakes
The human hand is a wonderful tool, full of sensors:
- Position
- Toughness / Roughness
- Heat
- Humidity
Social intelligence
Social intelligence tasks require several hard-to-automate skills:
- Recognize emotions
- Empathize with our interlocutor ("compute their state of mind")
- Exploit this context and past interactions
"Recognizing emotions is a challenge, but the ability to answer in a smart way to this information is even harder."
Nevertheless, Woebot Health: Relational Agent for Mental Health, a CBT (Cognitive Behavioral Therapy) chatbot, demonstrates that many CBT therapists' sessions could be automated.
It doesn't replace the therapist, but it can significantly reduce the number of sessions a therapy requires.
Creative intelligence
Creative intelligence is about producing new ideas that have creative value.
Generating novelty is easy; the main challenge is knowing how to describe our creative values so they can be encoded in software.
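The asymmetry above can be made concrete in a few lines. This is an illustrative sketch (the function names are invented for this example): producing something never seen before is trivial, while the function that would score its creative value is exactly the part no one knows how to write.

```python
import random
import string

random.seed(42)

def generate_novel_sequence(length: int = 12) -> str:
    """Novelty is cheap: random sampling yields strings almost certainly
    never produced before."""
    return "".join(random.choices(string.ascii_lowercase, k=length))

def creative_value(sequence: str) -> float:
    """The hard part: encoding human creative values in software is the
    open problem, so this function cannot (yet) be written."""
    raise NotImplementedError("no agreed-upon encoding of creative value")

print(generate_novel_sequence())
```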
Demo: Image generation from text
- Generate your image with stable diffusion
- Draw inspiration from the prompt examples in DiffusionDB to create your own prompt
- Repeat steps 1 & 2 until you have a nice result
Image generation from text prompt: explanation
CLIP: image → text
VQGAN: vector → image
VQGAN+CLIP: vector → image → text
The input vector is optimized so that the output (the image's CLIP embedding) matches the text prompt.
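The optimization loop behind VQGAN+CLIP can be sketched with stand-ins for the real networks. Everything here is an assumption made for illustration: a random matrix plays the role of the VQGAN decoder, a random vector plays the role of the prompt's CLIP embedding, and finite-difference gradient ascent replaces backpropagation. Only the shape of the idea matches the real system: adjust the input vector until the "image" scores well against the "text".

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the real models (assumptions, not the actual networks):
W_decode = rng.normal(size=(8, 8))  # plays the role of the VQGAN decoder
target = rng.normal(size=8)         # plays the role of CLIP's prompt embedding

def clip_score(z: np.ndarray) -> float:
    """Cosine similarity between the 'decoded image' and the 'prompt'."""
    img = W_decode @ z
    return float(img @ target / (np.linalg.norm(img) * np.linalg.norm(target)))

# Optimize the input vector z by finite-difference gradient ascent,
# mirroring how VQGAN+CLIP tunes the latent to match the text prompt.
z = rng.normal(size=8)
score_before = clip_score(z)
eps, lr = 1e-4, 0.3
for _ in range(1000):
    grad = np.array([
        (clip_score(z + eps * e) - clip_score(z - eps * e)) / (2 * eps)
        for e in np.eye(8)
    ])
    z += lr * grad

print(f"similarity before: {score_before:.3f}, after: {clip_score(z):.3f}")
```

The real pipeline differs in scale (millions of parameters, backpropagation through both networks) but not in structure: score the decoded image against the prompt, then nudge the input vector to raise that score.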