Adversarial attacks on neural networks

Neural networks today achieve state-of-the-art results in image recognition, yet surprisingly small perturbations to the input can fool them. So-called adversarial attacks alter pixels or lighting conditions so subtly that a human barely notices the difference, while the model confidently gives the wrong answer. The talk showed how such attacks work and why they matter for security in practice.

From neural networks to adversarial examples

A neural network is a set of interconnected nodes that transforms an input into an output; it is often used for classification, for example telling images of cats from images of dogs. As early as 2014, researchers showed that adding a very small, deliberately designed noise pattern to an image is enough to make the network switch to a wrong class with very high confidence, even though a human can barely see the change. The attack can be untargeted (any misclassification suffices) or targeted (the attacker forces the model to choose a specific incorrect class). An illustrative example: after a subtle modification, a photograph of a panda is classified as a gibbon with near-absolute confidence.
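
The idea of "deliberately designed noise" can be sketched in a few lines. The example below uses the fast gradient sign method (FGSM), one well-known way to craft such noise; the source does not say which method the talk demonstrated, so treat this as an illustration only. The three-class linear "network", its weights `W` and `B`, the four-pixel input `x`, and the budget `epsilon` are all hypothetical toy values chosen so the effect is visible without any library.

```python
import math

CLASSES = ["panda", "gibbon", "cat"]

# Hypothetical toy "network": one linear layer followed by softmax.
# Each row of W holds one class's weights over four input "pixels".
W = [
    [1.0, -1.0, 0.5, -0.5],   # panda
    [-1.0, 1.0, -0.5, 0.5],   # gibbon
    [0.5, 0.5, -1.0, -1.0],   # cat
]
B = [0.0, 0.0, 0.5]


def predict_proba(x):
    """Return softmax class probabilities for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x):
    """Return the name of the most probable class."""
    probs = predict_proba(x)
    return CLASSES[probs.index(max(probs))]


def loss_gradient(x, label):
    """Gradient of the cross-entropy loss for `label` with respect to x."""
    probs = predict_proba(x)
    # For softmax + cross-entropy: d(loss)/d(logit_k) = p_k - 1[k == label]
    delta = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    # Chain rule through the linear layer: grad_x = W^T * delta
    return [sum(delta[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, true_label, epsilon, target=None):
    """One FGSM step; each pixel moves by at most epsilon.

    Untargeted (target is None): step so the loss on the true label increases.
    Targeted: step so the loss on the chosen target label decreases.
    """
    if target is None:
        grad, direction = loss_gradient(x, true_label), 1.0
    else:
        grad, direction = loss_gradient(x, target), -1.0
    adv = [xi + direction * epsilon * sign(g) for xi, g in zip(x, grad)]
    return [min(1.0, max(0.0, a)) for a in adv]  # keep pixels in [0, 1]


x = [0.6, 0.4, 0.55, 0.45]          # toy image, classified as "panda"
panda = CLASSES.index("panda")
x_untargeted = fgsm(x, panda, epsilon=0.1)
x_targeted = fgsm(x, panda, epsilon=0.1, target=CLASSES.index("cat"))

print(predict(x), predict(x_untargeted), predict(x_targeted))
```

With these toy values, the untargeted step is enough to move the prediction away from "panda", while the targeted step steers it to the attacker's chosen class, "cat". No pixel changes by more than `epsilon`. Real attacks apply the same gradient idea to deep networks with far more pixels, which is why a much smaller per-pixel change can suffice there.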

Xiaolu Hou
