Alias-Free GAN (nvlabs.github.io)
I had run a few chaotic experiments with StyleCLIP a few months ago which would work very well with smooth interpolation: https://minimaxir.com/2021/04/styleclip/
Now it seems to actually learn the topology lines of the human face [0], as 3D artists would learn them [1] when they study anatomy. It also uses quad grids and even places the edge loops and poles in similar places.
[0] https://nvlabs-fi-cdn.nvidia.com/_web/alias-free-gan/img/ali... [1] https://i.pinimg.com/originals/6b/9a/0c/6b9a0c2d108b2be75bf7...
The comparisons are illuminative: StyleGAN2's mapping of texture to specific pixel location looks very similar to poorly implemented video-game textures. Perhaps future GAN improvements could come from tricks used in non-AI graphic development.
Still has the telltale sign of mismatched ears and/or earrings. This seems the most reliable way to recognize them. Well, and the nondescript background.
I wonder what dataset you could even use to tell a GAN about human internals. 3D renders of a skull with various layers removed?
> In a further test we created two example cinemagraphs that mimic small-scale head movement and facial animation in FFHQ. The geometric head motion was generated as a random latent space walk along hand-picked directions from GANSpace [24] and SeFa [50]. The changes in expression were realized by applying the “global directions” method of StyleCLIP [45], using the prompts “angry face”, “laughing face”, “kissing face”, “sad face”, “singing face”, and “surprised face”. The differences between StyleGAN2 and Alias-Free GAN are again very prominent, with the former displaying jarring sticking of facial hair and skin texture, even under subtle movements
Second, Hollywood doesn't care about that problem. They will take the best application of the technique, and they don't care if they have to apply a few manual touchups on the result. As long as there is one way of using the system to do the sort of thing they showed in the sample, it won't matter to them that they can't embed a full video game into the neural network itself. They only care about the happy path of the tech.
Someone's probably already starting the company now to use this in special effects, or putting someone on research in an existing company.
But I do appreciate the artefacts of StyleGAN2 as an artistic choice, too.
If you ask styleGAN to generate a specific image, that's possible, but you are no longer looking at how well these models generate images.
I click the website. I search "model". I see two results. Oh no, that means no download link to model.
I go to the github. Maybe model download link is there. I see zero code: https://github.com/NVlabs/alias-free-gan
Zero code. Zero model.
You, and everyone like you, who are gushing with praise and hypnotized by pretty images and a nice-looking pdf, are doing damage by saying that this is correct and normal.
The thing that's useful to me, first and foremost, is a model. Code alone isn't useful.
Code, however, is the recipe to create the model. It might take 400 hours on a V100, and it might not actually result in the model being created, but it slightly helps me.
There is no code here.
Do you think that the pdf is helpful? Yeah, maybe. But I'm starting to suspect that the pdf is in fact a tech demo for nVidia, not a scientific contribution whose purpose is to be helpful to people like me.
Okay? Model first. Code second. Paper third.
Every time a tech demo like this comes out, I'd like you to check that those things exist, in that order. If they don't, it's not reproducible science. It's a tech demo.
I need to write something about this somewhere, because a large number of people seem to be caught in this spell. You're definitely not alone, and I'm sorry for sounding like I was singling you out. I just loaded up the comment section, saw your comment, thought "Oh, awesome!" clicked through, and went "Oh no..."
> I go to the github. Maybe model download link is there. I see zero code
Paper was released today. Chill. They said they will release the code in September (I'm guessing late September). The paper is also a pre-print. They're probably aiming for CVPR and don't want to get scooped.
> Model first. Code second. Paper third.
That's how you produce ML code and documentation but that is not how you release it. I guarantee you that they are still tuning and making the model better. They were still updating ADA till pretty recently (last commit on the pytorch version is 4 months ago, to code).
I originally wasn't in CS, and when I first came over I wasn't in ML. We never had code. The fact that ML publishes models AND checkpoints is a godsend. I love it. Makes work so much easier and helps the community advance faster. I love this, but just chill. The paper isn't peer-reviewed. It is a pre-print. They're showing people what they've done in the last 6 months. It's part publicity stunt, part flex, part staking claim, but it is also part sharing with the community. Even without the code we learn a lot because they attached a paper to it. So chill.
It's also important that people understand that even if code is provided, it's commercially useless. From the NVAE license as an example[1]
> The Work and any derivative works thereof only may be used or intended for use non-commercially.
It's a great example of the difference between open source (which it is) and free software which it is not. So we're back to square one where it is probably best to clean-room the implementation from the paper, which is nearly useless to reproduce the model.
Their central improvement is that they limit the generation of high frequencies by ReLU through an upsample-ReLU-filter-downsample sequence.
Their theoretical section explains quite well why high frequencies can be proven mathematically to cause issues. And their practical implementation using filters to cut those off is very straightforward.
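To make that sequence concrete, here is a minimal 1D sketch in plain Python. It is only an illustration under stated assumptions: the Hann-windowed sinc filter, the fixed 2x factor, and every function name are my own choices, not the paper's actual 2D implementation.

```python
import math

def lowpass_kernel(cutoff, half_width=8):
    """Hann-windowed sinc low-pass; cutoff in cycles/sample (assumed design, not the paper's)."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = 2.0 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * n / (half_width + 1)))
        taps.append(2.0 * cutoff * sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unit gain at DC

def convolve_same(signal, kernel):
    """Zero-padded convolution that keeps the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

def upsample2(signal):
    """Insert zeros between samples, then low-pass at the original Nyquist rate."""
    stuffed = []
    for s in signal:
        stuffed.extend([2.0 * s, 0.0])  # factor 2 compensates for the inserted zeros
    return convolve_same(stuffed, lowpass_kernel(0.25))

def filtered_relu(signal):
    """Upsample -> ReLU -> low-pass filter -> downsample."""
    hi = upsample2(signal)
    hi = [max(0.0, v) for v in hi]                # ReLU creates new high frequencies
    hi = convolve_same(hi, lowpass_kernel(0.25))  # remove what the output rate cannot represent
    return hi[::2]                                # back to the original sampling rate
```

The point of the detour through the higher sampling rate is that the sharp corners ReLU introduces are filtered out before the signal is decimated, instead of folding back as aliasing.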
If someone tells you "The microphone recording had 50Hz noise so I used a filter to remove it", that's pretty much good enough for someone with experience in the field to replicate their results. This is the equivalent in AI. They uncovered a simple basic issue that everyone else overlooked, but once you know it, it seems obvious in retrospect.
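For anyone outside audio, the microphone analogy can be sketched in a few lines of plain Python. This is a hypothetical example, not anything from the paper: it uses the common second-order (biquad) notch form, and the sample rate, Q value, and function names are assumptions chosen for illustration.

```python
import math

def notch_coefficients(f0, fs, q=10.0):
    """Biquad notch centred on f0 Hz for sample rate fs Hz (coefficients normalized by a0)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def apply_biquad(signal, b, a):
    """Direct-form I difference equation."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def tone(freq, fs, seconds):
    """Unit-amplitude sine wave, used here as a stand-in for a recording."""
    return [math.sin(2.0 * math.pi * freq * n / fs) for n in range(int(fs * seconds))]

# Assumed setup: 1 kHz sample rate, 50 Hz hum to remove.
b, a = notch_coefficients(50.0, 1000.0)
hum_after = apply_biquad(tone(50.0, 1000.0, 1.0), b, a)
```

Once the start-up transient has died away, the 50 Hz tone is suppressed while frequencies far from the notch pass almost unchanged, which is why "I filtered it out" is enough of a description for someone in the field.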
Edit: the Debian Deep Learning Team's Machine Learning Policy explains why.
Hmm, I wasn't trying to nay-say anything here. I mostly agree with your original comment.
See also how in GAN Theft Auto they are sort-of getting the light reflection for free without having to explicitly teach the network about that part of physics.
Having the model is enough to verify the paper's claims, and also to experiment with new approaches (since you can fine-tune the model).
That said, I make this concession as a "meet you halfway" compromise between hard-line positions: "We can't release models, because we trained them on private data" and "You must release both models and data."
In other words, you're technically correct, but in my estimation it would do more harm to the end goal: the whole reason the scientific method is useful, is because it makes the world more useful.
The world would be less useful if fewer commercial companies participated in the scientific method. It's an inclusive group, not an exclusive clique. All you have to do, is give me the tools to verify your claims.
I'm not sure it is good enough for a variety of other possibly useful use-cases though; eliminating bias in an existing model, correcting a flaw in the training code, creating a different model and proving it is better by training on the same data etc.
It would be nice if there were more public/libre data sets for ML stuff.
Because it’s crucially important that we protect the scientific method here.
The sole goal is to help people like me reproduce the model. If I can’t reproduce the model, I can’t verify the paper.
When I saw “commercial” and then “open source” in your comment, I said “oh no…”
My duty is to the scientific method, so I don’t care if it’s the most restrictive code on the planet as long as I can use it to reproduce the model in the paper.
Because at that point, I have a baseline for evaluating the paper’s claims.
The reason I assume the paper is false until proven otherwise, is because the paper often doesn’t have enough detail to reproduce the model shown in the videos on these tech demos. Meaning, if they’re there to help me, the ML researcher, then they’re failing to tell me how to evaluate their claims rigorously.
(That said, it’s breaking my heart that I can’t agree with you here, because I want to so badly. I’ve felt similarly for years that scientific contributions need to be “free as in beer” commercially. But I recognize signs of zealotry when I see them, and I can’t let my personal views creep in, because people like me would stop listening if I was here e.g. arguing vehemently that nVidia needed to be delivering us something commercially viable along with a high quality codebase. The price for entry to the scientific method isn’t so high.)
It's not just a knowledge for knowledge's sake issue here, it's that it's not even knowledge they're publishing. They're publishing nothing.
If they made a license that says the code can only be provided for peer review and counter-validation, then that'd be knowledge. Then, the sake of it is another secondary problem.
The template screams NeurIPS though. Page limit for that would be 9 pages, this is 9.5, they might have started adding things after the first deadline, anticipating an extra page for camera ready?
I mean, that's just a bit of paper astrology of course. But if I'm right, then the author notification is September 28 and camera ready will be due in October, assuming it is accepted. So in that case releasing code (end of) September makes sense.
Edit: regardless of the (good) work NVidia have been doing over the last years, there is an issue here about big teams breaking the blind review process by putting themselves on the front of not just HN, but by now probably also the relevant twitter, fb, reddit pages. They know full-well that a release by NVidia will gain attention, and by the time review really gets started it's very likely any reviewer in their field will know exactly who they're reviewing.
That's a fair point and I'm not sure why I didn't consider that they would release a pre-print after they had submitted it. (This is a total fumble on my part)
> there is an issue here about big teams breaking the blind review process by putting themselves on the front
I don't see that as actually breaking the blind review part. There are many more abuses that de-anonymize themselves. Most transformer research is done by big labs because they need the processing power and are the only ones who can afford such equipment (though there was a paper that did transformers on CPUs). Just training ImageNet is out of bounds for a lot of people (I have a few A6000s and it still takes me days). A trivial example is that Google will use JFT and will include it everywhere. If you're qualified to review you're probably going to be able to de-anonymize the lab. I do think we need to do more to make a more level playing field but that's an extremely difficult thing to do. More resources just enables you to do more. But maybe we shouldn't metric hack as much, which would slow things down a little.
None of what you said is responsive to what I wrote. I think it's an opinion piece, but I'm not sure.
The issue here is the scientific method. I've listed the things that are required, as I see it. And I've also listed the reasons why I haven't been able to verify it exists here, despite trying for two years.
I'm glad that you like ML hacking, and I like it too. But models aren't a godsend; they're "the most basic, bare-minimum requirements of reproducibility."
Your reaction shouldn't be "I'm incredibly grateful you'd be willing to do this." It should be "You're required to do this, because if I can't verify your claims, your claims might be mistaken."
To leave it off on a softer note, normally I'd bond with you, ML hacker to ML hacker. Because I love ML, and I love hearing what you've been up to in ML. It's the best job in the world, as far as I'm concerned. (Could any other career give you the opportunity to be a developer advocate for high-performance computing in such an interesting way? https://github.com/google/jax/issues/2108#issuecomment-86623... Definitely looking for more examples of "Github Larping," if you know of any.)
If you agree that the scientific method is the reason ML moves forward, all I'm doing here is protecting it.
The scientific method is being followed here. Code is not needed for the scientific method to be followed. Neither is data. Literally every other field is able to advance without public code or data (in fact most areas of CS). There's absolutely no reason to believe that they won't release their code. They have a history of doing so. Models and checkpoints are not the bare-minimum for reproducibility. They describe their model enough in the paper. There's enough written in the paper (which is 30 pages) to reproduce the model. Will it be easy? No. But it can be done. And to be clear, I'm saying that the status quo of code being released is a godsend. This is not the norm in literally every other field/subfield. Code helps with reproducibility (and so should be encouraged) but is not required.
If you require someone else's code to reproduce results then you're not convincing me you're a good ML researcher nor programmer.
I retract my claims. You're right. Thanks for calling me out.
I will say that it's... a gargantuan effort to do the things that you're proposing. But as someone who did them you're right, you can. (BigGAN-Deep took a year to track down the bug https://github.com/google/compare_gan/issues/54)
BigGAN-Deep is a decent example of the thing I was really worried about: replication. I thought it'd be really easy to "just implement the paper." But no one had. Mooch did, but not at the same scale as the DeepMind release.
Maybe you're right about me, too. You're convincing me that I'm not a very good ML programmer. It's probably best to bow out on whatever high notes I've achieved.
Karras' work is fantastic. I don't know why this preview of things to come was where I chose to do this. Thank you, nVidia group, for working so hard.
I call bullshit. In computer science, not releasing the code of an algorithm whose output you describe is akin to maliciously obfuscating your methods. No serious paper should be accepted without a script to reproduce the exact same results again.
That said, I agree with your overall position on ML publications. So much of what we see is a tech demo protected by some kind of moat, either a private commercial dataset or insatiable processing requirements or missing code or a combination of the above. These aren’t science, they’re advertisements.
Then this is not a scientific contribution yet.
We must wait and see.
The most important tenet of science, is to doubt. I didn’t even read the name on the paper before I wrote my comment. Yes, I know this group. They’re why I got into ML, along with the group from OpenAI who published GPT-2. Because A+ science.
Their claims here are likely wrong unless and until proven otherwise. This isn’t a hardline position. It’s been my experience across many codebases, during my two years of trying to reproduce many ideas.
I agree that that is an example of A+ science. But why do you think they’re publishing this now, today? Either because of a conference deadline or because of nVidia pressure. Neither of those is related to helping me achieve the scientific method: reproducing the idea in the paper, to verify their claims.
All I can do is kind of try to reverse engineer some vague claims in a pdf, without those things.
--
Let me tell you a little bit about my job, because my time with my job may soon come to an end. I think that might clear up some confusion.
My job, as an ML researcher, is to learn techniques that may or may not be true, combine them in novel ways, and present results to others.
Knowledge, Contribution, Presentation, in that order.
The first step is to obtain knowledge. Let's set aside the question of why, because why is a question for me personally, which is unrelated.
Scientific knowledge comes when Knowledge, Contribution, and Presentation are all achieved in a rigorous way. The rigor allows people like me to verify that I have knowledge. Without this, I have mistaken knowledge, which is worse than useless. It's an illusion – I'm fooling myself.
When I got into ML two years ago, I thought that knowledge would come from reading scientific papers. I was wrong.
Most papers are wrong. That's been my experience for the past two years. My experience may be wrong. Maybe others obtain rigorous scientific knowledge through the paper alone.
But researchers happen to obtain a dangerous thing: prestige. Unfortunately, prestige doesn't come from helping others obtain knowledge. It comes from that last step -- presentation.
The presentation on this thread is excellent. It's another Karras release. I agree; there's no reason to doubt they'll be just as rigorous with this release as they are with stylegan2.
But knowledge doesn't come from presentation. Only prestige.
Prestige makes a lot of new researchers try very hard to obtain the wrong things.
If all of these were small concerns, or curious quirks, they'd be a footnote in my field guide. But I submit that these things are front and center to the current state of affairs in 2021. Every time a release like this happens, it generates a lot of fanfare and we come together in celebration because ML Is Happening, Yay!
And then I try to obtain the Knowledge in the fanfare, and discover that either it's absent or mistaken. Because there are no tools for me to verify their claims -- and when I do, I often see that they don't work!
That's right. I kept finding out that these things being claimed, just aren't true. No matter how enticing the claim is, or whether it sounds like "Foobars are Aligned in the Convolution Digit," the claim, from where I was sitting, seemed to be wrong. It contained mistaken knowledge -- worse than useless.
Unfortunately, two years with no salary takes a toll. I could spend another few years doing this if I wanted to. But I wound up so disgusted with discovering that we're all just chasing prestige, not knowledge, that I'd rather ship production-grade software for the world's most boring commercial work, as long as the work seems useful and the team seems interesting. Because at least I'd be doing something useful.
I'd be very interested in your thoughts on that position, because if it's mistaken, I shouldn't be saying it. It represents whatever small contribution I can make to fellow new ML researchers, which is roughly: "watch out."
In short, for two years, I kept trying to implement stated claims -- to reproduce them in exactly the way you say here -- and they simply didn't work as stated.
It might sound confusing that the claims were "simply wrong" or "didn't work." But every time I tried, achieving anything remotely close to "success" was the exception, not the norm.
And I don't think it was because I failed to implement what they were saying in the paper. I agree that that's the most likely thing. But I was careful. It's very easy to make mistakes, and I tried to make none, as both someone with over a decade of experience (https://shawnpresser.blogspot.com/) and someone who cares deeply about the things I'm talking about here.
It takes hard work to reproduce the technique the way you're saying. I put all my heart and soul into trying to. And I kept getting dismayed, because people kept trying to convince me of things that either I couldn't verify (because verification is extremely hard, as you well know) or were simply wrong.
So if I sound entitled, I agree. When I got into this job, as an ML researcher, I thought I was entitled to the scientific method. Or anything vaguely resembling "careful, distilled, correct knowledge that I can build on."
There are always assumptions. At least with public code and models those assumptions are laid bare for all to see and potentially expose any bad assumptions.
In his defense, the spirit of his rant was valid; the letter made it sound entitled.
I'm in the middle of a PhD and this is always an issue. It takes a while to learn how to read papers and to gather enough background knowledge that you can read between the lines (publications are limited, you can't put everything in a paper. This is why having code is so great, it accelerates the process). You're two years into your journey, this is often when things _start_ turning the other direction. There's a reason PhDs take so long, and that's with experts (hopefully) helping you learn how to read papers, telling you which papers to read (which is a challenge in and of itself), having the ability to spend full time on learning, and learning how to build background knowledge on a subject while learning the state of the art. There's a reason ML pays the big bucks. It takes a long time to learn/gather expertise, it is fucking difficult, and it has direct applications that can lead to useful products today (a big component of why you get paid big bucks). It is also easy to lose track of your progress. I remember the first research paper I read was complete gibberish to me. I'm 3 years into my PhD and now I can understand papers in my niche. But for a long time a lot of stuff didn't click. This is normal. It takes time to learn and 2 years isn't that much (especially when you have a full time job). Making contributions in your first year of a PhD is atypical, even in your second year. It only happens at top universities where people have a lot of help and resources.
Research is hard. It takes years to become an expert and learn how to read papers. Don't give up, but calm down and recognize that given more time things will make more sense.
The bar isn't low, this is just a pre-print.
Well tell that to my advisor (it's also something I've done in the past). So my experience doesn't reflect your claim.
> No serious paper should be accepted without a script to reproduce the exact same results again.
You do realize that this is a pre-print, right? If it went to NeurIPS then they did release the code to them and will release the code to the public later.
And I'm not trying to say you suck. But you said you've been studying the subject for only 2 years. So I am going to check you. It's easy to grow an ego, but it often isn't useful. Sucking at something is the first step to being somewhat good at something. And you're clearly past the step of "sucking" but not to the step of "wizard." I don't know where you are between there tbh. But I do understand the frustration haha. That is normal.
Side note: usually it is good practice to note that you edited comments. It was rather confusing to look back and see something different.
I spent years trying to understand how it can be that some software that was supposed to correspond to a well-cited paper did not even come close to reproducing the results of said paper,
This was my exact experience. I didn’t understand why I kept having it, and kept blaming myself for not being careful enough. My code must be wrong, or the data, or something.
Nah. It was the idea.
Kept feeling like a kick in the gut, until here we are today, when I’m warning everyone that Karras, of all people, might publish such a thing.
I really appreciate that you posted this, because I’m so happy I wasn’t alone in the feeling of “what’s going on, here…?”
The replication crisis in psychology threw out 50% or so of supposed scientific results.
If this (or just straight fraud) is common elsewhere, it seems like knowing about that would be a good thing for science.
Fwiw I think actual knowledge is there in the ML literature, but it's not in these benchmark-chasing, highly tuned papers. It's more high level stuff, like basic architecture building blocks etc. GANs and Transformers for example. They undeniably work, and the knowledge needed to implement them can probably be conveyed in a few pages maximum. No need for an implementation to be provided by the author, really.
Why should graduate students have to spend years trying to reproduce stuff that turns out to be no good? Nobody should have to put up with getting their time wasted like that.
[0] https://arxiv.org/abs/1812.04948
In at least some big companies in the private sector we have “blameless postmortems” where we describe what went wrong in an operational failure without blaming the participating employees.
Maybe I am wrong though, and a better culture is possible, like the shift to preprints has happened in a lot of fields and was probably previously unthinkable. So good on you for taking an idealistic stance, I am probably just being grumpy. That being said, whatever culture changes may be beneficial, I stand by my original point that simply dumping code and model alongside the paper is not unambiguously good and may even obscure problems.