Understanding Aesthetics with Deep Learning (devblogs.nvidia.com)

83 points by jipy9 9 years ago | 29 comments

saltenhav 9 years ago |

Jurgen Schmidhuber has an interesting perspective on the rewarding aspects of art and music.

To paraphrase, "Artists (and observers of art) get rewarded for making (and observing) novel patterns: data that is neither arbitrary (like incompressible random white noise) nor regular in an already known way, but regular in a way that is new with respect to the observer's current knowledge, yet learnable (that is, after learning, fewer computational resources are needed to encode the data)".

In other words, enjoyment of art is about learning (easy) patterns. Schmidhuber likens "fun" to the improvement of an observer's ability to compress a scene.

The source is worth the read: http://people.idsia.ch/~juergen/creativity.html
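
A toy sketch of the compression idea, for anyone who wants to poke at it. This is not Schmidhuber's formulation: it assumes zlib as a stand-in for the observer's compressor and a preset dictionary as a stand-in for "what the observer has already learned". The helper names and sample data are made up for illustration.

```python
import random
import zlib


def encoded_size(data: bytes, known: bytes = b"") -> int:
    """Bytes zlib needs for `data`, optionally primed with already-seen material."""
    if known:
        # zdict primes the compressor with prior "knowledge" (an assumption of this toy)
        compressor = zlib.compressobj(level=9, zdict=known)
    else:
        compressor = zlib.compressobj(level=9)
    return len(compressor.compress(data) + compressor.flush())


def compression_progress(data: bytes, studied: bytes) -> int:
    """How many bytes cheaper `data` becomes after the observer has studied `studied`."""
    return encoded_size(data) - encoded_size(data, known=studied)


rng = random.Random(0)  # fixed seed so the example is repeatable


def random_bytes(n: int) -> bytes:
    return bytes(rng.getrandbits(8) for _ in range(n))


theme = random_bytes(256)             # an unfamiliar motif the observer studies
remix = theme[128:] + theme[:128]     # new piece built from that motif: learnable
noise_seen = random_bytes(256)        # arbitrary data the observer studies
noise_new = random_bytes(256)         # fresh arbitrary data: nothing carries over
regular = b"ab" * 128                 # regularity zlib already handles unaided

progress = {
    "learnable": compression_progress(remix, studied=theme),
    "noise": compression_progress(noise_new, studied=noise_seen),
    "regular": compression_progress(regular, studied=regular),
}
print(progress)
```

In this toy, only the "learnable" case shows a large saving: noise stays incompressible however much of it has been seen, and the trivially regular data was cheap to encode to begin with.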

phreeza 9 years ago |

I wonder how much our sense of aesthetics has to do with the perceived scarcity of a work or the effort needed to create it. I remember the first HDR photos looked absolutely mind-blowing to me, but now that the process has been automated and is ubiquitous, they just look tacky.

Grazester 9 years ago | |

HDR only looks tacky when it's done tacky, ridiculously amping up the saturation and removing all the shadows

samlevine 9 years ago | |

"fashion is clothing so terrible we have to change them every six months"

Govannon 9 years ago | | |

"Fashion is a form of ugliness so intolerable that we have to alter it every six months."

zaaakk 9 years ago | |

read adorno

yusee 9 years ago | | |

Have you read him? If so, why not synthesize his arguments so we can all understand?

phreeza 9 years ago | | |

Care to elaborate? Does he go into this?

kafkaesq 9 years ago |

A better title would have been "Using ML to rank and classify images according to aesthetics." Which in itself is quite impressive. But nowhere do we see indications that these algorithms "understand" the images, in any meaningful sense.

ibuildthings 9 years ago | |

The author here. I used the term "understanding" not in the sense of machines understanding the images, but in the sense of a scientific attempt at understanding aesthetics. (From the text: "empowering me to develop systems for understanding images from a computational and scientific perspective".)

kafkaesq 9 years ago | | |

Fair enough -- thanks for clarifying.

yusee 9 years ago |

Aesthetics is a game of cat and mouse. Artists create some new things. Then critics and theorists observe the patterns of composition, color, proportion, etc., that are popular. These rules are canonized in books. Then artists challenge the rules.
The comparisons to music theory in this thread are apt. Music theory is always behind music production.
How can you understand aesthetics without understanding creativity?

BanzaiTokyo 9 years ago |

I believe that there are factors that go beyond visible composition. As a thought experiment, I imagine that the brain would evaluate the aesthetics of two similar images differently depending on whether it recognizes the object depicted or not: when evaluating an image of a recognized object, other qualities of the object (not necessarily visible in the image) will be taken into account.

maldusiecle 9 years ago | |

Yeah, exactly. No good critic of photography thinks that subject matter is irrelevant, that you can understand pictures as if they were abstract compositions of light and color. You might as well try to read a poem in an unknown language. This algorithm might learn to identify certain cliches, but it'll never learn what makes a picture powerful.

randcraw 9 years ago | | |

You have to wonder, though. Is it impossible that there's a "music theory" for images/paintings/art that explains the mechanics of what makes them more compelling vs less compelling? I suspect there is, at least to some degree.
Obviously images convey much more information than music, so any theory that doesn't encompass the semantics of the subject will miss most of the signal. But is there a theory for the presentation and composition of the subject? To some degree, I'm confident there is.
Some of the methods used to debug the deep learning of images already do a fair job of showing the locus of focus in the image where the DNN found maximum information. I can see such a technique discovering many of the techniques used by artists and photographers to direct the observer's eye or juxtapose objects that conflict.

posterboy 9 years ago | | |

We listen to foreign music a lot and still find pleasure in the voices. Although I'm mostly speaking from my experience with English before I had learned it, which is yet rather close to German, so YMMV
>No True Scotsman thinks that subject matter is irrelevant
fixed that for you

leblancfg 9 years ago | |

I agree. Although one could imagine that concepts conveyed in a photograph could be extracted and abstracted as vectors -- just like word2vec and its successors. Of course, there is a long way to go before we hit "human understanding" parity, but I think ideas from [1], [2] and [3] could be extrapolated in doing just that.
[1] Deep Visual-Semantic Alignments for Generating Image Descriptions - cs.stanford.edu/people/karpathy/cvpr2015.pdf
[2] Deep Learning for Content-Based Image Retrieval - www.research.larc.smu.edu.sg/mlg/papers/MM14-fp336-hoi.pdf
[3] Deep Learning for Content-Based Image Retrieval - www.cs.rutgers.edu/~elgammal/pub/MTA_2014_Saleh.pdf
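
To make the "concepts as vectors" idea concrete, here is a minimal sketch of how such vectors would be compared. The four-dimensional embeddings below are hand-written, hypothetical values, not output from any of the models in [1], [2] or [3]; a real system would produce much larger learned vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical concept embeddings for three photographs (illustration only).
photo_concepts = {
    "sunset_beach": [0.9, 0.1, 0.8, 0.0],
    "sunrise_lake": [0.8, 0.2, 0.9, 0.1],
    "city_traffic": [0.1, 0.9, 0.0, 0.7],
}


def most_similar(name, embeddings):
    """Return (other_name, similarity) for the photo closest in concept space."""
    query = embeddings[name]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != name
    ]
    return max(scored, key=lambda pair: pair[1])


print(most_similar("sunset_beach", photo_concepts))
```

With these made-up numbers the two landscape photos land close together and the street scene far away, which is the kind of structure word2vec-style embeddings are meant to capture.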

controll 9 years ago |

I don't think computers would be able to understand aesthetics. It is a really high-level concept. Plus, I think deep learning is marketing mumbo jumbo and does not perform much better than a linear SVM.

visarga 9 years ago | |

Then why are we using deep convolutional networks for state-of-the-art vision and speech when we could just plug in an SVM with handcrafted features? From what I know, error rates in vision have dropped from 25% to less than 5% since the arrival of deep learning. That's no trifle, especially at the higher end of the accuracy scale. It's very hard to conquer those last few percent.