Understanding Aesthetics with Deep Learning (devblogs.nvidia.com)
To paraphrase, "Artists (and observers of art) get rewarded for making (and observing) novel patterns: data that is neither arbitrary (like incompressible random white noise) nor regular in an already known way, but regular in a way that is new with respect to the observer's current knowledge, yet learnable (that is, after learning, fewer computational resources are needed to encode the data)".
In other words, enjoyment of art is about learning (easy) patterns. Schmidhuber likens "fun" to the improvement of an observer's ability to compress a scene.
The source is worth the read: http://people.idsia.ch/~juergen/creativity.html
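To make the compression idea above concrete, here is a rough sketch. It is not Schmidhuber's formulation: zlib stands in for the observer's encoder, and a preset dictionary (`zdict`) stands in for what the observer has already learned. The names `encoded_size` and `compression_progress` are made up for this illustration.

```python
import random
import zlib


def encoded_size(data: bytes, knowledge: bytes = b"") -> int:
    """Bytes zlib needs to encode `data`, optionally primed with prior `knowledge`."""
    if knowledge:
        comp = zlib.compressobj(level=9, zdict=knowledge)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


def compression_progress(data: bytes, knowledge: bytes) -> int:
    """Bytes saved once the observer has learned `knowledge` (toy proxy for 'fun')."""
    return encoded_size(data) - encoded_size(data, knowledge)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# A novel but learnable pattern: looks random at first, fully predictable once seen.
motif = bytes(rng.randrange(256) for _ in range(256))
# Arbitrary noise: having seen earlier noise does not help with new noise.
noise_seen = bytes(rng.randrange(256) for _ in range(256))
noise_new = bytes(rng.randrange(256) for _ in range(256))
# Regular in an already known way: cheap to encode even before any learning.
regular = b"ab" * 128

progress_motif = compression_progress(motif, knowledge=motif)
progress_noise = compression_progress(noise_new, knowledge=noise_seen)
progress_regular = compression_progress(regular, knowledge=regular)

print("motif:", progress_motif, "noise:", progress_noise, "regular:", progress_regular)
```

Only the learnable motif shows a large saving after "learning"; the noise and the already-regular data show essentially none, which is the distinction the paraphrase draws.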
The comparisons to music theory in this thread are apt. Music theory is always behind music production.
How can you understand aesthetics without understanding creativity?
Obviously images convey much more information than music, so any theory that doesn't encompass the semantics of the subject will miss most of the signal. But is there a theory for the presentation and composition of the subject? To some degree, I'm confident there is.
Some of the methods used to debug the deep learning of images already do a fair job of showing the locus of focus in the image where the DNN found maximum information. I can see such a technique discovering many of the techniques used by artists and photographers to direct the observer's eye or juxtapose objects that conflict.
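One such debugging method is occlusion sensitivity: blank out one patch of the image at a time and record how much the model's score drops. A minimal sketch follows, assuming a toy scoring function in place of a trained DNN. `toy_score` and its hidden "attention" point at row 1, column 4 are invented for this example.

```python
def toy_score(image):
    """Hypothetical stand-in for a trained network's output.

    It weights each pixel by its closeness to (row 1, col 4), so that region
    carries the most 'information' for this fake model.
    """
    return sum(
        pixel / (1 + (r - 1) ** 2 + (c - 4) ** 2)
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
    )


def occlusion_map(image, score_fn, patch=2):
    """Score drop caused by zeroing each non-overlapping patch of the image."""
    base = score_fn(image)
    rows, cols = len(image), len(image[0])
    heat = []
    for top in range(0, rows, patch):
        heat_row = []
        for left in range(0, cols, patch):
            occluded = [row[:] for row in image]  # copy, keep the original intact
            for r in range(top, min(top + patch, rows)):
                for c in range(left, min(left + patch, cols)):
                    occluded[r][c] = 0.0
            heat_row.append(base - score_fn(occluded))
        heat.append(heat_row)
    return heat


image = [[1.0] * 6 for _ in range(6)]  # uniform 6x6 'photo'
heat = occlusion_map(image, toy_score)

# Patch (row, col) in the heat map whose removal hurts the score the most.
hottest = max(
    ((r, c) for r in range(len(heat)) for c in range(len(heat[0]))),
    key=lambda rc: heat[rc[0]][rc[1]],
)
print("most informative patch:", hottest)
```

With a real network you would replace `toy_score` with the model's class score; the heat map then shows where in the frame the model is "looking".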
>No True Scotsman thinks that subject matter is irrelevant
fixed that for you
[1] Deep Visual-Semantic Alignments for Generating Image Descriptions - cs.stanford.edu/people/karpathy/cvpr2015.pdf
[2] Deep Learning for Content-Based Image Retrieval - www.research.larc.smu.edu.sg/mlg/papers/MM14-fp336-hoi.pdf
[3] Deep Learning for Content-Based Image Retrieval - www.cs.rutgers.edu/~elgammal/pub/MTA_2014_Saleh.pdf
We listen to foreign music and find pleasure in the voices because a voice itself is expressive, independent of language: a cry, a laugh, an imprecation. The same is true of light, shape, etc., but artists who work with these qualities independent of their reference to objects tend to choose media other than photography, for obvious reasons.
He shows how art has a semi-autonomous 'truth-content' (a la Kant), but one that is always the product and embodiment of unresolved contradictions within the larger social fabric (a la Hegel, and more so Marx).
Returning to the NVidia article at the root of this thread, this passage pops out as problematic (certainly for Adorno, but also in general): "In our case, we use supervised machine learning, with a dataset of photographs pre-categorized as aesthetically pleasing or not." There is a real sense of question-begging going on here. And Adorno would say that this approach forecloses the very fact that the definitional boundaries of these categories are constantly shifting and lack any real social stability.
edit: links.... http://www.upress.umn.edu/book-division/books/aesthetic-theo... http://plato.stanford.edu/entries/adorno/#4 http://plato.stanford.edu/entries/kant-aesthetics/
The work is not at all contradictory to Adorno, especially in the sense that it is explicitly trying to be as non-reductionist as possible, and it assumes that the notion of aesthetics is a dynamic entity.
There is a finite pattern in the dataset; more interestingly, it has its share of subtleties (as opposed to, for example, image classification problems), and the technological question is whether we can capture these.
But there is another interesting data question. For our work, we curated our training set with the help of expert curators. But the dataset itself is a metamorphosing entity: it is subject to revision (a continuous process for us at the moment) and, more interestingly, it is a chance for open debate between our curators. In some sense, technology allows us to codify and challenge our notion of aesthetics at a given point in time (especially as our training sets evolve).
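To illustrate how a supervised model only codifies the labels it was given at one point in time, here is a toy sketch. It is not the article's method: a nearest-centroid classifier stands in for the deep network, and the two features (symmetry, saturation), the photos, and both label sets are invented.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(photos, labels):
    """Return {label: centroid} for the labelled photos."""
    by_label = {}
    for features, label in zip(photos, labels):
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def predict(model, features):
    """Label whose centroid is closest (squared Euclidean distance)."""
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(model[label], features)),
    )


# Hypothetical (symmetry, saturation) features for six curated photos.
photos = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.6), (0.3, 0.7), (0.2, 0.9), (0.1, 0.8)]

# Curators' labels before and after a (made-up) round of debate in which
# two asymmetric, saturated photos are re-judged as pleasing.
labels_before = ["pleasing"] * 3 + ["not pleasing"] * 3
labels_after = ["pleasing"] * 5 + ["not pleasing"]

new_photo = (0.4, 0.7)
verdict_before = predict(train(photos, labels_before), new_photo)
verdict_after = predict(train(photos, labels_after), new_photo)

print(verdict_before, "->", verdict_after)
```

The same photo gets a different verdict once the curators revise two labels, so the model is a snapshot of a moving consensus rather than a fixed definition.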
Analysis of these elements (form, line, space, color, and texture) is usually a part of the sort of art criticism you'd find in academic studio art, art history, or even just the New York Times art section.
The visual design field has a similar, extended set of elements for describing the formal elements of a design piece.
In both art and design, works are usually considered effective if they use the formal elements of art/design to support what you refer to as the semantics of the subject. That's a broad generalization, but you see it in practice a lot, so it seems like a fair thing to say.
Academic art history is starting to feel the influence of machine learning and computer vision precisely because computers can be trained to recognize the formal elements of art and associate their use with movements and historical periods. There are way more detailed articles than this one, but this will get you started if you're interested in this sort of thing:
https://www.technologyreview.com/s/537366/the-machine-vision...
This is particularly challenging in art (as compared e.g. to financial markets) because much of what defines new art is specifically what makes it different from what has come before it. That is to say, art, by its nature, will always beat any rules you try to design, because that is what it does, indeed, what it must do.
The proof is in the pudding: machine learning systems that learn the statistical trends in a body of works and then generate similar art have existed since at least the 80s, if not earlier, and their output evokes the very definition of the derogatory term "cookie cutter art." "Good" art, then, by contrast, is exactly the art that does not fit into such a model, plus "something".
Surely it is that "something" we'd like to find. But I am afraid that using rule-based or statistical analysis to help curators sort through art, even with the stated intention of helping them find "diamonds in the rough", will create an echo chamber in which the next diamond, which by definition is quite different from the diamonds that came before it, remains undiscovered, buried in a pile of sorted spam.
It is for this reason that I believe that despite the advances in machine learning, nothing will ever replace the pastime of "crate digging" for finding gems. The DJ's job will never completely die.
... I will add: That is not to say that tools for automatically understanding and measuring aspects of a photo or piece of music are not useful for artists as a way of judging their own work and making decisions. But it is exactly those artists that will look at the "goodness indicator" drop one notch while they make a change, and say, "I'm fine with that", who will produce the next important work.
No, you cannot recognize anything as art until you have laid out the rules that describe art.
>But there is no guarantee that these will allow you to predict what makes future art
Prediction from samples is covered by the sampling theorem, which in theory holds only for periodic signals and an infinite number of samples. In practice, though, the output from my sound card is rather fine, and facial recognition software works, too.
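For what it's worth, here is a small sketch of what the sampling theorem does and does not give you, using a truncated Whittaker-Shannon (sinc) interpolation. The 10 Hz sample rate, the test frequencies, and the helper names are arbitrary choices for illustration. A finite number of samples only approximates the ideal reconstruction.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(signal, rate_hz, t, n_terms=200):
    """Estimate signal(t) from its samples at n / rate_hz, n in [-n_terms, n_terms]."""
    return sum(
        signal(n / rate_hz) * sinc(t * rate_hz - n)
        for n in range(-n_terms, n_terms + 1)
    )


def sine(freq_hz):
    """A pure sine wave of the given frequency, as a function of time in seconds."""
    return lambda t: math.sin(2 * math.pi * freq_hz * t)


RATE_HZ = 10.0       # assumed sample rate; the Nyquist limit is 5 Hz
slow = sine(1.0)     # below the limit: recoverable from its samples
fast = sine(7.0)     # above the limit: its samples are ambiguous
alias = sine(-3.0)   # the lower-frequency signal with identical samples

t = 0.25  # a point between two sample instants
slow_error = abs(reconstruct(slow, RATE_HZ, t) - slow(t))
fast_error = abs(reconstruct(fast, RATE_HZ, t) - fast(t))

# The 7 Hz and -3 Hz sines agree at every sample instant of a 10 Hz clock.
samples_match = all(
    abs(fast(n / RATE_HZ) - alias(n / RATE_HZ)) < 1e-9 for n in range(50)
)

print("slow error:", slow_error)
print("fast error:", fast_error)
print("7 Hz and -3 Hz samples identical:", samples_match)
```

The slow sine is recovered closely between samples, while the fast one is reconstructed as its alias. So "prediction from samples" works only as long as the assumption behind the sampling (here, the frequency limit) holds for whatever comes next.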