Linux kernel maintainer says no to AMDGPU patch (lists.freedesktop.org)
> Here's the thing, we want AMD to join the graphics community not hang out inside the company in silos. We need to enable FreeSync on Linux, go ask the community how would be best to do it, don't shove it inside the driver hidden in a special ioctl. Got some new HDMI features that are secret, talk to other ppl in the same position and work out a plan for moving forward. At the moment there is no engaging with the Linux stack because you aren't really using it, as long as you hide behind the abstraction there won't be much engagement, and neither side benefits, so why should we merge the code if nobody benefits?
> The platform problem/Windows mindset is scary and makes a lot of decisions for you, open source doesn't have those restrictions, and I don't accept drivers that try and push those development model problems into our codebase.
They provide a standard implemented by the driver and not the hardware. There is not even a standard to get performance metrics for GFX cards. Nothing.
I agree with Dave. If you do not want to create the standard, leave others to do it. But having an HAL inside the driver is problematic.
Shall we have a cross-platform standard for writing cross-platform drivers? Write once, run everywhere? Why not as long as it is open source.
But it still needs someone to govern it, like the Linux kernel project, and device companies do not seem interested. Which says a lot about their intentions.
> all sorts of "linux'isms" from code daily and deal with the pain of porting non portable Linux code to their platform.
If you're developing for Linux, using Linux specific technology, then of course there would be porting effort required.
Same as if you want to make your Windows stuff work on Linux, there should be porting required - after all, it's a different platform.
What AMD wants to do is to sidestep as much of the porting as possible, by effectively shipping their Windows code inside the Linux kernel.
Your case is about porting code between different OS kernels.
i'm definitely not mad and bitter about Linux in any way, grasping at straws in an attempt for relevance. i promise you. pinky swear.
DA instead is saying to the developers that they need to play ball and work with the existing Linux DRI world and not silo themselves off.
Oh please.
No, the boss said "merge this" and the code has been developed "corporate style" (with a HAL, etc.), and now the "mean" kernel developers won't approve this.
But the Kernel people are right, because the other option would be to introduce code that breaks every now and then and is unmaintainable. See all the ACPI issues for example, that only stopped when Linus said "no changes can break existing functionality anymore"
Lack of profanities already exceeded the LKML reputation.
I think the point about rules being applied consistently is very true. If Alice does the work to comply then Bob shouldn't be able to get away without doing it just because he's bigger.
Please prefer the term "Digital Restriction Management". :-)
AMD tried the same in their open source driver and were rejected by the kernel maintainer. Unified drivers have code sharing advantages but don't follow the practices of the Linux kernel.
Edit: Here's the start of the thread https://lists.freedesktop.org/archives/dri-devel/2016-Decemb...
And definitely much better than my only other inkling for what that meant in relation to computer systems.[2]
Well, at least for the DRM (Direct Rendering Manager) subsystem.
One of these is cloud computing on large clusters of headless machines using the parallelization that GPUs are known for. If you want to do this right you definitely need input from a lot of sources, not just hacks in AMD delivered code.
The No men is all that stands between us and the Yes men.
Praise and salutations to the No men, God bless you.
("Do users do X often?" isn't the question; "do they get annoyed when they can't?" is the question, and hardcore gamers tend to have one computer for gaming and oftentimes other computers for other stuff; if they were even using Linux on those it'd be a paradigm shift)
I think what you mean is "I get the idealism but you also need to be realistic." It's not pragmatic to stick to your guns and ask a multi-million dollar company to change the code they submit to your open-source project.
And before anyone mentions Android and ChromeOS, Google can replace the kernel and only OEMs writing drivers, most of them closed, would notice.
Nothing to do with Windows and OSX being bundled with the hardware /s
So this rejection is about maintainership that negatively affects distribution of the amdgpu module as a side effect. It's nothing that can't be solved by linux distributions though.
We propose to use the Display Core (DC) driver for display support on AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to avoid a flag day the plan is to only support uGPU initially and transition to older ASICs gradually.
The DC component has received extensive testing within AMD for DCE8, 10, and 11 GPUs and is being prepared for uGPU. Support should be better than amdgpu's current display support.
I mean, it is all GPL, so it's perfectly okay. Is it too much for some dev in search of fame to do this?
This is me talking with my community hat on (not my Intel maintainer hat), and with that hat on my overall goal is always to build a strong community so that in the future open source gfx wins everywhere, and everyone can have good drivers with source-code. Anyway:
- "Why not merge through staging?" Staging is a ghetto, separate from the main dri-devel discussions. We've merged a few drivers through staging, it's a pain, and if your goal is to build a strong cross-vendor community and foster good collaboration between different teams to share code and bugfixes and ideas then staging is fail. We've merged about 20 atomic modeset drivers in the past 2 years, none of them went through staging.
- "Typing code twice doesn't make sense, why do you reject this?" Agreed, but there are fundamentally two ways to share code in drivers. One is you add a massive HAL to abstract away the differences between all the places you want your driver to run in. The other is that you build a helper library that programs different parts of your hw, and then you have a (fairly minimal) OS-specific piece of glue that binds it together in a way that's best for each OS. Simplifying things of course here, but the big lesson in Linux device drivers (not just drm) is that HAL is pain, and the little bit of additional unshared code that the helper library code requires gives you massive benefits. Upstream doesn't ask AMD to not share code, it's only the specific code sharing design that DAL/DC implements which isn't good.
- "Why do you expect perfect code before merging?" We don't, I think compared to most other parts in the kernel DRM is rather lenient in accepting good enough code - we know that somewhat bad code today is much more useful than perfect code 2 years down the road, simply because in 2 years no one gives a shit about your outdated gpu any more. But the goal is always to make the community stronger, and like Dave explains in his follow up, merging code that hurts effective collaboration is likely an overall (for the community, not individual vendors) loss and not worth it.
- "Why not fix up post-merge?" Perfectly reasonable plan, and often what we do. See above for why we tend to accept not-yet-perfect code rather often. But doing that only makes sense when things will move forward soon and fast, and for better or worse the DAL team is hidden behind that massive abstraction layer. And I've seen a lot of these, and if there's not massive pressure to fix up the problem it tends to get postponed forever since demidlayering a driver or subsystem is very hard work. We have some midlayer/abstraction layer issues dating back from the first drm drivers 15 years ago in the drm core, and it took over 5 years to clean up that mess. For a grand total of about 10k lines of code. Merging DAL as-is pretty much guarantees it'll never get fixed until the driver is forked once more.
- "Why don't you just talk and reach some sort of agreement?" There's lots of talking going on, it's just that most of it happens in private because things are complicated, and it's never easy to do such big course correction with big projects like AMD's DAL/DC efforts.
- "Why do you open source hippies hate AMD so much?" We don't, everyone wants to get AMD on board with upstream and be able to treat open-source gfx drivers as a first class citizen within AMD (stuff like using it to validate and power-on hardware is what will make the difference between "Linux kinda runs" and "Linux runs as good or better than any other OS"). But doing things the open source way is completely different from how companies tend to do things traditionally (note: just different, not better or worse!), and if you drag lots of engineers and teams and managers into upstream the learning experience tends to be painful for everyone and take years. We'll all get there eventually, but it's not going to happen in a few days. It's just unfortunate that things are a bit ugly while that's going on, but looking at any other company that tries to do large-scale open-source efforts, especially hw teams, it's the same story, e.g. see what IBM is trying to pull off with open power.
Hope that sheds some more light onto all this and calms everyone down ;-)
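A rough sketch of the two code-sharing designs contrasted in the list above, written in Python purely for illustration. Real drivers are C, and every name here (`pll_dividers`, `mode_registers`, `LinuxGlue`, `OtherOsGlue`, the register names, the 100 MHz reference clock) is hypothetical and not taken from DAL/DC or DRM. The shared part is a helper library of plain functions that know the hardware; each OS keeps a small piece of glue that calls those helpers in whatever way suits that OS, instead of routing everything through one abstraction layer.

```python
"""Illustrative only: shared hardware helpers plus thin per-OS glue.

All names are hypothetical; nothing here mirrors real amdgpu, DAL/DC or DRM code.
"""

# --- Shared helper library: knows the hardware, knows nothing about any OS ---

def pll_dividers(pixel_clock_khz, ref_clock_khz=100_000):
    """Pick an integer (multiplier, divider) pair approximating the pixel clock."""
    best = None
    for divider in range(1, 17):
        multiplier = round(pixel_clock_khz * divider / ref_clock_khz)
        if multiplier < 1:
            continue
        error = abs(ref_clock_khz * multiplier / divider - pixel_clock_khz)
        if best is None or error < best[0]:
            best = (error, multiplier, divider)
    return best[1], best[2]


def mode_registers(width, height, pixel_clock_khz):
    """Return the (register name, value) writes needed for a display mode."""
    multiplier, divider = pll_dividers(pixel_clock_khz)
    return [
        ("H_ACTIVE", width),
        ("V_ACTIVE", height),
        ("PLL_MULT", multiplier),
        ("PLL_DIV", divider),
    ]


# --- Thin OS-specific glue: each side talks to its own OS in its own idiom ---

class LinuxGlue:
    """Stand-in for glue that would plug into the Linux modesetting framework."""

    def __init__(self):
        self.writes = []

    def atomic_commit(self, state):
        # The OS framework hands over a complete state; the glue only asks the
        # shared helpers for register values and records the writes.
        for reg, value in mode_registers(state["w"], state["h"], state["clock"]):
            self.writes.append((reg, value))


class OtherOsGlue:
    """Stand-in for another OS with a different, call-per-mode model."""

    def __init__(self):
        self.writes = []

    def set_mode(self, width, height, pixel_clock_khz):
        self.writes.extend(mode_registers(width, height, pixel_clock_khz))
```

Both glue classes end up issuing the same register writes for the same mode, yet neither one has to pretend to be the other OS. In this sketch the only unshared code is the few lines inside each glue class, which is the "little bit of additional unshared code" the comment refers to.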
It will be time consuming to change now.
https://lists.freedesktop.org/archives/dri-devel/2016-Februa...
> Cleaning up that is not enough, abstracting kernel API like kmalloc or i2c, or similar, is a no go. If the current drm infrastructure does not suit your need then you need to work on improving it to suit your need. You can not develop a whole drm layer inside your code and expect to upstream it.
https://lists.freedesktop.org/archives/dri-devel/2016-Februa...
> Cleaning up that is not enough, abstracting kernel API like kmalloc or i2c, or similar, is a no go. If the current drm infrastructure does not suit your need then you need to work on improving it to suit your need. You can not redevelop a whole drm layer inside your code and expect to upstream it.
> Linux device driver are about sharing infrastructure and trying to expose it through common API to userspace.
> So i strongly suggest that you start thinking on how to change the drm API to suit your need and to start discussions about those changes. If you need them then they will likely be usefull to others down the road.
https://lists.freedesktop.org/archives/dri-devel/2016-Februa...
The result is that there are no people to do the required work.
So they wanted to get rid of the ugly kernel part that was a PITA to install and update and are now pushing a lot of their hardware abstraction code from Windows into kernel patches so the AMDGPU driver can just talk to the Windows blob pretty much verbatim (which is now AMDGPU Pro).
The practical effect is a continuation of the status quo. AMDGPU Pro, without a lot of this functionality, is either broken or underperforming across all distros. It is still better than what the last FGLRX was, but nowadays the Gallium free driver they also develop is beating the blob in almost everything except the latest driver-level optimized games.
Most distros have completely dropped all proprietary AMD support. Going forward, it will be up to them to ship a proprietary driver and maintain installation faculties pretty much everywhere. AMDGPU with Mesa is going to continue working fine, new GPUs are getting supported still, and a lot of what this HAL does (display / window management) has had usable support that has worked for years in various parts of Gallium / Mesa / DRM.
The optimistic future is that AMD drops AMDGPU Pro, refocuses developer effort on AMDGPU / Gallium, and works with the rest of the Linux graphics community to implement Freesync / Trueaudio / whatever other tech AMD has buzzwords on into shared kernel code rather than trying to stick it in a HAL from their Windows driver.
The pessimistic view has AMD just fire or reassign a lot of its Linux staff, leaving its hardware on the platform to wilt. It would never stop entirely, AMD provides programming manuals for their hardware and most of their ASM in new platforms to enable almost anyone to program their GPUs (unlike Nvidia, who publishes nothing, requiring devs to reverse engineer their hardware and ASM) so the support would still be better than Nouveau.
From that perspective, saying that "the optimistic future is that AMD drops AMDGPU Pro" is a bit silly, since it's largely the same code base as AMDGPU and the same people working on both.
It also makes no sense at all to say that "a lot of what this HAL does has had usable support [...] in various parts of Gallium / Mesa", since Gallium and Mesa are purely concerned with rendering and video. They don't care about display. In fact, you can actually use radeonsi and the rest of the open-source stack on top of the amdgpu-pro kernel module. (And for that matter, the closed source Vulkan driver is supposed to be compatible with an otherwise open source stack.)
Also, AMDGPU is not necessarily fine going forward, precisely because of this display code issue. Yes, the memory management and rendering/video engine parts are going to be just fine, but that won't do you a lot of good (outside of compute purposes) if you can't light up a display...
I have run desktop Linux across a dozen (maybe slightly more) machines over a decade, and friends will ask me for advice stepping into that world. On graphics drivers, my safest recommendation has always been:
- If AMD, use the open source version.
- If Nvidia, use proprietary.
- If Intel integrated, thank whatever god you believe in for Mesa.
What is it about Nvidia's GPUs, community relations or [insert other topic] that hobbles their open source driver so thoroughly compared to AMD's? Or alternatively: Why is AMD's proprietary driver, despite their (presumably) deeper knowledge of their graphics hardware, unable to be more stable than the open source equivalent when Nvidia's is?
Nvidia support for Linux is hard to get as well. Now AMD support is also hard to get.
Hardware OEM companies make Windows drivers, and don't seem to care about Linux. This is the same thing that happened to IBM and OS/2: no good third-party driver support.
There are open source drivers that work, but are not as fast as the proprietary drivers.
Linux needs better display drivers and for that OpenCL or Vulkan support as well. Windows uses dotnet and DirectX for games.
Not necessarily. It not being merged into mainline Linux doesn't mean that other parties can't include it with their own distros.
Read an article or something.
Source: My desktop PC has an R9 Nano, and I've used the amdgpu driver since Linux 4.5 (when it got power management support for my card and thus could enable usual GPU clock speeds).
Now to be fair, one merit of Linus is that he is direct and says when he doesn't like something, rather than using weasel language like "maybe later" or other time-wasting expressions that force people to guess what he is thinking.
Linux is open source, so if the kernel developers desire better designed code, they are free to change the code up to their quality levels. If the kernel development team does not have the manpower for this, they had better think about a way to maintain the kernel that involves less work. One example (among many) would be to think about a way to keep the internal kernel interfaces stable over many years, so that there is only rarely a lot of work to be done updating all the drivers to new internal kernel interfaces.
> If you're developing for Linux, using Linux specific technology, then of course there would be porting effort required.
The released open source drivers seem to work quite well (as they do on Windows). The problem is that they don't fit the taste of the kernel developers.
Open source does not mean, "any code accepted here!"
> if the kernel developers desire better designed code, they are free to change the code up to their quality levels.
They are also free to reject bad code and demand that if you want your code in, you should improve it.
I don't get this mentality at all; why should the kernel developers accept inferior code and then improve it? Isn't that the responsibility of the vendor who designed the product? After all, AMD is a for profit company, not a charity. Why should the other developers provide charity to make AMD code better? So that AMD can sell more units or have better PR? What?
>The problem is that they don't fit the taste of the kernel developers.
That's not the problem.
The problem is that the drivers were designed for Windows, not Linux and such code is not suitable for inclusion in the Linux kernel.
If anything, that post highlights the lack of quality among the amd driver team, and doesn't have to do much with 'taste'.
Or, they could have AMD do the work, since apparently they're the ones who didn't listen -- after being told months ago -- that this rejection could happen when they tried to include a HAL with their driver. I think that perfectly works: the kernel developers don't have to "do all that work" involving un-fucking AMD's driver, and AMD instead has to do the work. Sounds good to me.
There's literally 0 point in accepting the code as-is, because everyone would be on the hook for maintaining it in the meantime, while it got un-crappified, and it would make the graphics subsystem maintainers' lives worse. No. Kick it out, make them do it the right way, and when they come back -- they can talk.
This whole thread has got plenty of entitled whiners and people with a bone-to-pick against Linux, like yourself, bitching about OSS maintainers not making their own lives harder because you want to feel good about your graphics driver. Get over it. Or, get involved -- maybe you could send a few patches to AMD maintainers to clean out some of the crap.
AMD is not being banned from kernel development, but they're going to have to do it right. Just like the other 20-30 companies that regularly contribute upstream to Linux with their money/developers.
In fact, Linux often accepts drivers for hardware that only certain companies have access to and are of no use to anyone else in the general public. Why? Because they played by the rules, meaning the overall maintenance cost to include those drivers becomes far smaller in the long run. The cost of including the AMDGPU driver, as it is now, is astronomically high in comparison.
> The released open source drivers seem to work quite well (as they do on Windows). The problem is that they don't fit the taste of the kernel developers.
No. Let's be clear: the problem is AMD can't listen, and people like you apparently, can't read. That's about all it comes down to. You are free to now whine and complain about how incredibly important this is and how it's definitely worth breaking the rules over and how much it means to you, and I'm sure the kernel developers owe you this feature, or something (after all, developers are just robots with no lives and if they don't work hard enough, they're bad.)
Meanwhile, it will be ignored, Linux will go on (and continue to crush its competitors in the spaces it matters in), and the world will still turn. And maybe in a year from now AMD will actually have something worth merging. In the mean time, Nvidia will continue to dominate them in the compute market. Maybe actually listening 9 months ago would have saved them some time and market share.
NVIDIA is about the furthest thing from a shining example of good behavior in the Linux kernel. By not having open source drivers at all, they're far worse.
(By the way, their closed source driver has a HAL too.)
But abstracting all this is fitting the infrastructure to their needs (which is having a common driver infrastructure for many operating systems).
Kernel development is largely an optimization process of the internal model of what a kernel should do, and for that developers need first-class raw data: the DRM maintainer wants to know how you'd like to change DRM. And though you can put that into a translation/abstraction layer, it's not helpful, because that doesn't scale: the maintainer would have to look at every such layer and come up with a common one, and repeat this grueling task every time they want to move forward with DRM itself.
AMD provides public documentation for how the driver should interact with their GPUs, Nvidia does not.
Oh, and AMD employs some of the people developing the open Radeon and AMDGPU (non-pro) drivers, while AFAIK Nouveau is a pure community project.
This depends on whether there is at least 1 person at AMD who has experience with kernel development. If all of the team members are kernel outsiders, then this might confuse/surprise them. Otherwise someone likely raised this as a potential issue.
They didn't listen.
In the end, it kind of sucks, but was the right thing to do.
If AMD decided to simply leave their driver as a closed-source blob, they would not have this problem. But all the Linux fanboys insist that AMD open source their graphics drivers. Just to make one thing clear: AMD already released specifications beforehand, so the kernel developers could have developed an independent graphics driver if they wanted. But it is better to shitstorm companies not to develop a driver than to sully one's hands.
AMD was nice and did its job. But instead of being satisfied with the result, they now want AMD to start playing the political game with them (for the record: IMHO the correct thing to do would be to bring the driver into the staging area, and then it would be up to the kernel developers to sully their hands bringing it up to the standards they want).
NVidia simply refuses to open source their drivers and does not get into this kind of trouble. Lesson learned: Never negotiate with terrorists.
You already lost the argument, if you have to go with that, but I'll try my best to explain anyway; what the kernel developers want is for people to use the standard, already maintained interfaces, instead of companies developing their own and adding unnecessary complexity to the kernel, which would then need to be maintained by someone for a long time.
AMD is basically writing an abstraction layer to allow them to use their Windows driver code, the Linux maintainers are saying; why should we have an "inferior" driver, that's basically "ported" from Windows? If you want to support Linux, write code that interacts with standard Linux interfaces, follows our conventions and benefits the community as a whole.
> Lesson learned: Never negotiate with terrorists.
So not wanting to merge shitty* code into your codebase is now terrorism?
As I wrote: AMD already indulged a lot (first specifications, then even an open source Linux driver). Even after the first step the kernel developers would have been able to write a Linux driver (though it is a lot of work). I already clarified: AMD did all this to satisfy what the Linux fanboys wanted (while NVidia did nothing). The kernel developers could have said "thanks for all the work you did, AMD. The driver is currently not up to the standards that we desire, but it still helped us to make the driver development a lot less work. We [kernel developers] will do the remaining job and lift the kernel up to the superior quality that we want." But that is not what the kernel developers did. Instead they want AMD to dance to the kernel developers' tune.
OTOH Linux has some extremely useful syscalls that others just don't have, sometimes causing major performance regressions. Case in point: sync_file_range.
But then Linux also has a history of screwing up and having a bunch of different syscalls on different platforms because somebody introducing a syscall didn't quite think it through. Case in point: sync_file_range. This "only" causes extra work for developers directly working with syscalls - so usually libc devs.
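To make the portability point concrete, here is a minimal sketch of how userspace code might use the Linux-only call when it is there and fall back elsewhere. It is an illustration under assumptions, not a recommended durability strategy: `flush_range` and `_load_sync_file_range` are hypothetical helper names, the call goes through libc's `sync_file_range` wrapper via `ctypes`, and any failure of that call simply falls back to `os.fsync`.

```python
"""Illustrative only: use a Linux-specific syscall when present, else fall back."""
import ctypes
import os
import sys

# Flag values as defined for sync_file_range in Linux's <fcntl.h>.
SYNC_FILE_RANGE_WAIT_BEFORE = 1
SYNC_FILE_RANGE_WRITE = 2
SYNC_FILE_RANGE_WAIT_AFTER = 4


def _load_sync_file_range():
    """Return libc's sync_file_range on Linux, or None if it is unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        func = libc.sync_file_range
    except (OSError, AttributeError):
        return None
    # int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned flags)
    func.argtypes = [ctypes.c_int, ctypes.c_int64, ctypes.c_int64, ctypes.c_uint]
    func.restype = ctypes.c_int
    return func


_sync_file_range = _load_sync_file_range()


def flush_range(fd, offset, nbytes):
    """Flush part of a file; return the name of the mechanism actually used."""
    if _sync_file_range is not None:
        flags = (SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE
                 | SYNC_FILE_RANGE_WAIT_AFTER)
        if _sync_file_range(fd, offset, nbytes, flags) == 0:
            return "sync_file_range"
    # Portable but coarser: flushes the whole file rather than one range,
    # which is where the performance gap on other platforms comes from.
    os.fsync(fd)
    return "fsync"
```

Going through the libc wrapper rather than the raw syscall number is deliberate here: the wrapper is what hides the per-architecture differences in the underlying syscall that the comment complains about. Note also that the two paths are not equivalent, since `sync_file_range` does not write out file metadata the way `fsync` does.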
BSD is BSD
Windows also has different sys calls than both Linux and BSD, that's because it's a different OS.
Yes, there's POSIX in there, but generally it is its own OS and you need to treat it as such, same as Windows really.
Yes, Linux, Mac OSX and BSDs have their differences.
But in the kernel, code is going to (or even has to) be less generic
Heh.
I get the same feeling about git when I'm working with hg, the Canada[1] of git. For svn, darcs, bzr, and Mercurial, "revert" means to change the working directory to the repo state. Why did git use it to mean creating a new commit that undoes an old commit?
There's no use rallying against the majority software when it doesn't follow the standard. When you're the majority, you just get to make your own standards because you are the standard. BSD being the Canada of Linux, it's a little funny to hear them complaining about things that Linux does.
--
[1] http://itre.cis.upenn.edu/~myl/languagelog/archives/005497.h...
It seems to be an argument of: we do this for all drivers, vs amd pushing: we want to be the exception for this driver only.
Not that tiny hypervisors and stuff aren't cool, though.
There's a lot of people that care about gaming, even if you don't.
The gaming and demoscene cultures don't care for one second about how much their tools cost or the openness of hardware and software tooling; they care about the achieved results and getting their stuff into the hands of users, regardless of how.
The GNU/Linux culture is all about the ideology of having stuff for free, replicating a desktop experience as if CDE was the epitome of UX, filled with xterms.
Of course I am generalising and might get tons of counter examples, just noting my personal experience regarding friends and co-workers.
Uh, no. It's about having the freedom to fix, improve, or otherwise modify the software you use. Being free-as-in-beer happens to be a requirement for that, but it isn't the goal. Think about it this way: free software developers get paid to do work, instead of getting paid for having done work like proprietary software developers. You pay me to implement feature X, which is then released to the world for further improvement in the future.
Yeah, no. Having free stuff (as in speech), yeah. Having stuff for free ? That's not the UNIX culture...
In my opinion the existing AMDGPU code isn't exactly spectacular as it is. They barely have comments or commit messages and there's a ton of duplication. Linux has issues with keeping driver contributions up to snuff as it is, without enormous vendor-specific HALs everywhere.
What I see here is (sadly, again) two groups of developers unwilling to meet halfway and understand each other's problems. Expecting AMD to support a completely separate driver just for Linux is unrealistic. Expecting a 100kLOC code dump to be accepted is unrealistic as well. I don't see anyone talking about how to get over this hurdle on lkml, I just see the single least constructive word: "No."
Meanwhile, 3D support in Linux will remain a crappy tire fire which works well only if you use a completely proprietary nVidia driver.
-Dave Airlie, in TFA.
1. There is absolutely value in rejecting bad, or even good but unmaintainable code from your codebase. How is this even an argument?
2. The 'devs' don't just all meet and then decide to blow each other off anyway, AMD is simply in a position with Steam where they want it to "just work" for most games at the lowest investment cost possible. They took a gamble and lost.
3. An updated proprietary driver is not ideal, but works better than making the OS worse. Again, not sure you can really disagree.
The maintainer explains the pragmatism explicitly:
> AMD can't threaten not to support new GPUs in upstream kernels without merging this, that is totally something you can do, and here's the thing Linux will survive, we'll piss off a bunch of people, but the Linux kernel will just keep on rolling forward, maybe at some point someone will get pissed about lacking upstream support for your HW and go write support and submit it, maybe they won't. The kernel is bigger than any of us and has standards about what is acceptable
Rejecting half-assed patches is pretty pragmatic, no matter who the author is. Maintaining standards is pragmatic because 'your open-source project' is the one that will be maintaining (refactoring/rewriting) the code in the future, not the multi-million dollar company.
That has basically been Linus Torvalds' job for the last 20 years. People want to contribute to the Linux kernel to get support for the thing that they are interested in, but often the code that they are offering should not be accepted as-is, because it will make Linux as a whole that bit worse. See DBus for an example where clever people strongly put forward useful functionality, and got push-back. The end result was that they went back to the drawing board, and designed something better.
AIUI, the reason that the AMD and nVidia proprietary graphics drivers are a terrifying mass of hacks on top of hacks is trying to say yes to everything. Years later, the vendors can only move forward by setting fire to the whole lot.
All of these people pretty much contribute their free time to it.
If they can't make basic architectural decisions that improve the worst kinds of work (driver authorship and maintenance is awful drudgery) how can you expect them to feel any kind of ownership over their fate?
You're asking unpaid people to do the work people get paid for. Even worse, when this work just gets dumped on those unpaid people by people who are paid quite well.
[0] 4.5 Development statistics https://lwn.net/Articles/679289/. Just google lwn Development statistics for more.
Oh but it is in their employers' interest. It's the price of admission for mainline. And if they want in on mainline, whether simply to harvest PR or to net a contract that demands a mainlined kernel, they have to pay it. Just another case of the well-known "cost of doing business".
In the case of AMD I believe they want to reap the benefits of mainline (that is, not having to support the breakage that comes with being out of tree) and to be able to compete better with Nvidia; since AMD is unlikely to ever develop an OpenGL implementation as good as theirs but Nvidia cannot or is unlikely to be able to open source their driver.
I'd be willing to bet that most people really only want windows to appear quickly, scrolling in web pages to work well, and to watch videos online.
Lots of casual gaming has moved to mobile, and never left the consoles. Hardcore gaming -- not most people.
In any case, why are you updating the kernel version every month?
They're sending the message that AMD would be better off shipping closed source drivers, like Nvidia, because that would save them the headache of trying to get their merely good but not perfect open source drivers into the hands of users via dealing with Kernel politics.
But then again, what else is new.
And no, I don't want just "good" code in kernel I am using. This is not business. Make it maintainable so it can get better (perfect) in the long run. (yes, I know there are places in kernel where code is not even good, let alone perfect, but that's another issue altogether)
And you really think continuing with the hacky reverse engineering is the better solution?
It's a net loss for users of AMD hardware (though AMD's hardware on Linux has been a dumpster fire for as long as I can remember) because they have to suffer arguably worse driver support, but that's on AMD for wanting to have their cake and eat it too
AMD used to have a much better hardware story for compute. I don't know how it looks now, but nVidia have absolutely stolen the market due, in part, to their excellent software -- even on Linux.
My only experience of compute on AMD was a FirePro v7900 -- an expensive, workstation-class card. With both the latest, and the 'workstation-class' Catalyst Linux drivers, my LuxMark tests came out very fast, but very red.
With nVidia, I can simply add a repo to my Ubuntu machine and have the latest stable drivers every time I do a dist-upgrade. If I want a solid, tested CUDA dev environment, I can install the CUDA repo and do likewise.
AMD have to make sure the end-user experience with these cards is as smooth as that, and that everything works.
I really hope AMDGPU-PRO is that experience. I haven't tried it yet, so I can't comment.
It's easy to dismiss enthusiasts / hobbyists / developers / gamers as a 'small market'. There was no 'pro gamer' market until ~10 years ago. Now there are entire companies built off the back of it. AMD cannot continue to leave a sour taste in end-users' mouths, otherwise there might not be any left soon.
AMD it seems won't be able to do that now.
Yes they can, they'll just have to commit resources to keeping up with kernel changes instead of having it done for them upstream. You can't have your cake and eat it as well.
It is a lose-lose situation for a developer. If I write open source software, people will demand more and more from me. If I say, "screw this, I'm just gonna release a blob." they will ridicule me while ignoring the fact that I am releasing the blob only because they pushed me to. /rant
Video tearing has been a constant problem if you use any type of compositor like Compton or the one that comes with XFCE or GNOME. I tried it on various systems and the tearing is there. A lot of people don't seem to mind though. For some reason, the Ubuntu maintainers don't think my hardware (or rather all laptops) should have the capability to hibernate to disk so they disable the /sys/disk (I'm not sure I got the correct filename) which enables suspend to disk (this is one of the reasons why I need to use mainline anyway). PulseAudio doesn't play nice with DACs, ALSA is a pain to set up.
I really like Linux (so much that I keep 'ricing' my system) but these are the kinds of things that I'd rather not spend my time on.
Do you know who the elusive kernel developers that you keep referring to are? They're mostly employees of various other companies who want their code merged into the kernel and have to follow the same standards to get their code in!
Why should AMD be any different? Why should somebody else pick the slack up for AMD? Does AMD pick the slack up for Intel as well? And who would then keep up with their upcoming silicon etc.? (It wouldn't be AMD, since this approach makes it easier for them.) Why would one company be allowed to sidestep what is required to get your code into the kernel? Is it because they showed some good will in the past? Is that why inferior code should now be allowed?
> they want AMD to dance to the kernel developer's piping.
So let me get this straight. AMD wants to merge their code into the Linux kernel, (not the other way around), but it's the kernel developers who should instead "dance to the AMD developer's piping"?
Look, it's good that they are trying, and obviously the code may be accepted at some point in the future if it's good enough, but for now, let's not let our emotions (AMD are the good guys) override our rational thinking.
Hardware differs a lot in complexity. GPUs are very complicated.
Independently: The job security of these people also depends on the fact that it is so politically involved to get "their" driver into the kernel. So they surely have no incentive to make it easier for other companies/developers to get their drivers in (for example by very stable internal kernel interfaces).
> AMD wants to merge their code into the Linux kernel
AMD indulged on the desire of lots of Linux users for open source drivers. They did their job. NVidia did nothing.
Yes, Intel has an open-source GPU driver in the kernel, AMD can too, they just need to follow the conventions.
> they surely have no incentive to make it easier for other companies/developers to get their drivers in (for example by very stable internal kernel interfaces).
The kernel interface around DRI is actually quite stable, I don't think making it hard for AMD to merge in their driver would help with anybody's job security. It would certainly not be enough to affect GPU market share in a significant way, so it would be very risky for little gain?
Why invent conspiracy theories, rather than just accept the far more likely explanation that AMD's code is not up to the job?
> AMD indulged on the desire of lots of Linux users for open source drivers. They did their job.
No they didn't. If they wanted to fulfil the promise of delivering a kernel driver because their users demanded it, they would have done their job had they produced code that follows the conventions and worked with the maintainers to get the code accepted.
Throwing some code over the wall does not meet any reasonable definition of "doing their job".
> NVidia did nothing.
Ah, so now it's about, "but look over here, they're even worse!"
I mean, OK, NVidia did nothing, AMD did something, but not enough. Intel did even more than AMD did and is still not perfect. Why couldn't AMD be at least as good, if not better than Intel? If you want to compare, why compare against the worst, rather than the best player?
And, why can't we judge this independently?
Irrespectively of NVidia or Intel, this is what AMD produced and it's not yet good enough.
[0] https://lists.freedesktop.org/archives/dri-devel/2016-Februa...
Sometimes drivers opt out because the hardware is odd enough that it needs to do things differently, or because it offloads to hardware (or proprietary firmware) functionality that is usually done in software. But often, it's just because a driver was written in isolation by the manufacturer and then dumped on the community. (See [2], from the comments in [1], about the work required to clean up some Realtek WiFi drivers enough to be merged to the kernel staging area.) If a driver unnecessarily opts out of common frameworks and does things internally and differently, it can be hard to even evaluate whether problems fixed in the standard frameworks exist in the special snowflake drivers. Even after identifying a problem, the recipes that worked to fix the standard drivers won't apply.
[0] https://lwn.net/Articles/454390/
[1] https://lwn.net/Articles/705884/
[2] https://www.linuxplumbersconf.org/2016/ocw//system/presentat...
Proprietary drivers are tolerated, not liked, and people aren't interested in making it easier for them.
The majority of APIs are not exposed to NDK users, only to OEMs.
Google could release, let's say, Android 8 with another POSIX-compliant kernel and the only apps that would notice are the ones using unofficial APIs.
Currently, Android is released using the Linux kernel, the ELF executable format, POSIX APIs, and so on. There is no Android/kFreeBSD, nor Android/NT.
Android games can be launched on Linux using Android libraries (not all, but some work pretty well, see: http://www.shashlik.io/showcases/ ).
Linux tools can be launched on Android systems (including X based tools, if X server is running).
For most of practical purposes, Android is Linux.
Debian user space isn't the same thing as a kernel.
Android kernel doesn't expose the same syscalls as a standard Linux.
I am really keen on Google replacing Linux with Magenta, then we can carry on this discussion about what Android is supposed to be.
Compiling and using a vanilla kernel instead of the distro one is a straightforward and easy job for people with basic knowledge of source building. It's not a job for "very few brave souls". The reason many people use distro kernels is because they're good enough.
OTOH any consumer Android device requires millions of loc patching to a several years old version of Linux kernel just to boot.
And, frankly, if Google decided to swap out Linux for, say, DragonflyBSD, almost no-one would notice or care.
That said, the community isn't afraid of breaking changes to push future versions forward.
Linux doesn't follow SemVer.
Whereas in Android, you just get Java and a tiny bit of C and C++.
Check the NDK documentation, Google provides a list of the set of APIs any NDK application is allowed to use.
Since many used to ignore that list, starting with Android 7, any app that uses unauthorised native libraries will get killed.
Besides, Nvidia's been having trouble with their Tegra GPUs on Android, and as a result have been forced to pitch in a bit on Nouveau (the reverse-engineered open-source Nvidia driver). They're still having trouble with their driver situation on mobile, as a result of their unwillingness to play ball with the kernel.
Actually, that last sentence above - I'm really not too confident on that, I've heard various hearsay but the only source I concretely remember is the "other drivers" section of http://richg42.blogspot.com.au/2014/05/the-truth-on-opengl-d...
Nvidia has every ability to ship it, they just refuse to open it.
ELI5: why does each Linux kernel release break driver code? It can't be THAT hard to just have a stable interface and leave it for long periods of time, e.g. only bumping it on major version bumps in the Kernel?
There is no rule that kernel interfaces can only change on major bumps. In reality, they change quite frequently, as new APIs and drivers are merged in, which requires generalization, refactoring, etc. across API boundaries to keep things sane. Kernel developers specifically reject the notion of a "stable ABI" like this because they feel it would tie their hands, and lead them to design APIs and workarounds for things which would otherwise be fundamentally simple if you "just" break some function and its call sites. APIs in Linux tend to organically grow, and die, as they are needed, by this logic.
Why wait 5 years for a "major version bump" to delete an API call, you could just do it today and fix the callers, since they're all right there in the kernel tree? It's far easier and more straightforward to do this than attempting to work around "stable" systems for very long periods of time, which is likely to accumulate cruft.
Because they do not care about out-of-tree code, when an API changes, their obligations are to refactor the code using that API, inside the kernel, and nothing else. That means the person making the change also has to fix all the other drivers, too, even if they don't necessarily maintain them. Out of tree users will have to adapt on their own.
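To make that concrete, here is a toy model of the workflow, in Python rather than kernel C. Every name in it (`Mode`, `set_mode`, the driver functions) is made up for illustration and is not a real kernel interface. The assumption is that `set_mode` used to take `(width, height)` and was changed to take a single struct-like argument, with the in-tree callers fixed in the same change:

```python
from dataclasses import dataclass


@dataclass
class Mode:
    """Hypothetical argument struct introduced by the API change."""
    width: int
    height: int
    refresh_hz: int


def set_mode(mode: Mode) -> int:
    """New form of the (hypothetical) subsystem call.

    The old form was set_mode(width, height); it no longer exists.
    Returns the number of pixels in the requested mode.
    """
    return mode.width * mode.height


# In-tree drivers: updated in the same change that altered set_mode().
def in_tree_driver_a() -> int:
    return set_mode(Mode(1920, 1080, 60))


def in_tree_driver_b() -> int:
    return set_mode(Mode(1280, 720, 60))


# Out-of-tree driver: still written against the old two-argument form,
# because whoever changed the API never saw this code.
def out_of_tree_driver() -> int:
    return set_mode(1280, 720)  # stale call


def try_driver(driver) -> str:
    """Report whether a driver still works against the current API."""
    try:
        driver()
        return "ok"
    except TypeError:
        return "broken"


if __name__ == "__main__":
    for drv in (in_tree_driver_a, in_tree_driver_b, out_of_tree_driver):
        print(drv.__name__, try_driver(drv))
```

The two in-tree drivers keep working because they were fixed alongside the API; the out-of-tree one breaks until its owner adapts it.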
This also explains why they do not want a HAL. When a Linux driver interface changes, the person changing it is responsible for changing everything else and fixing other drivers. That means if AMD wants a large change, it may have to go and touch the Intel driver and refactor it to match the new API. If Intel wants something new, they may have to touch the AMD driver in turn. This, in effect, helps reduce the burden and share responsibilities among the affected people.
They don't want a HAL because a HAL is a massive impediment to exactly that workflow. If Intel wants to improve a DRM/DRI interface in the kernel for their GPUs, they could normally do so and touch all the other drivers. Out with the old, in with the new. But now, they'd have to also wade through like 50,000 lines of AMD abstraction code that no other system, no other driver, uses. It effectively makes life worse for every graphics subsystem maintainer when this happens, except for AMD I guess since they can pawn off some of the work. But if AMD plays by the rules -- Intel fixing their AMDGPU driver when they make a change shouldn't be that unusual, or any more difficult compared to any other graphics driver. And likewise -- AMD making a change and having to fix Intel's driver? That's just par for the course.
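A rough sketch of the structural difference, again as a Python toy and not actual DRM code. `NativeDriver`, `VendorHal`, `HalBackedDriver` and `call_depth` are all hypothetical names, and the "trace" is only a stand-in for the layers someone has to read through when they change the shared interface:

```python
class NativeDriver:
    """Driver written directly against the shared (hypothetical) interface."""

    def enable(self, width: int, height: int) -> list[str]:
        # One hop: shared interface -> hardware programming.
        return [f"native.enable({width}x{height})"]


class VendorHal:
    """Hypothetical vendor-private layer, shared with the vendor's other OS ports."""

    def set_timing(self, params: dict) -> list[str]:
        return [f"hal.set_timing({params['w']}x{params['h']})"]


class HalBackedDriver:
    """Driver that only translates the shared interface into vendor HAL calls."""

    def __init__(self) -> None:
        self._hal = VendorHal()

    def enable(self, width: int, height: int) -> list[str]:
        # Two hops: shared interface -> translation shim -> private HAL.
        trace = [f"shim.enable({width}x{height})"]
        return trace + self._hal.set_timing({"w": width, "h": height})


def call_depth(driver, width: int = 1920, height: int = 1080) -> int:
    """Number of layers a maintainer has to follow for one interface call."""
    return len(driver.enable(width, height))


if __name__ == "__main__":
    print("native:", call_depth(NativeDriver()))
    print("hal-backed:", call_depth(HalBackedDriver()))
```

If the signature of `enable` changes, the native driver is fixed in one place. The HAL-backed one needs the shim changed and someone to work out what the private `VendorHal` expects, which is code no other driver shares.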
Obviously Linux isn't perfect here and they do, and have, accepted questionable things in the past, or have rejected seemingly reasonable API changes out of stability fear (while simultaneously not wanting a stable ABI -- which is fair). But the logic is basically something like the above, as to why this is all happening.
> https://opensource.org/osd-annotated
The reasons why "free software" people don't like the word "open source" are indeed political:
> https://www.gnu.org/philosophy/open-source-misses-the-point....
For software for which the source code is available, but does not give the four freedoms:
> https://www.gnu.org/philosophy/free-sw.html#content
it is common to use the word "shared source" (originally devised by Microsoft):
Note that both the amd and the nvidia kernel modules always have been FOSS because of the GPL license. It's just that nvidia provides it by its own ways, not through the official linux branch, and thus doesn't have to respect linux rules nor to document the driver.
The only open source part of their modules was a shim, while 99% of the driver is contained in a blob. That's true for both Nvidia and ATI/AMD fglrx.
I don't think the AMDGPU driver not being mainlined really affects Linux. On the other hand, AMD will really benefit from it.
Since the kernel maintainers, and other open source contributors wrote drivers?
> I don't think the AMDGPU driver not being mainlined really affects Linux. On the other hand, AMD will really benefit from it.
You remember the days when Linux never would work on any real PC because audio drivers, GPU drivers, everything was missing? Do you want those back?
Having shitty drivers in the kernel isn’t ideal either, just like microkernels vs. monolithic kernels are a tradeoff, but at least cooperating with them in how to get it best into the kernel (AMD rewrote ~100k LOC since the last "Nope" from the maintainers) would be a lot more helpful than this.
Is that the lesson we want to give to companies?
When they're equivalent products there's really not much to pay for though. That should push the costly product to improve more or else lose sales.
A driver is inherently platform-specific. It's glue that ties the hardware to the operating system. The only "correct" way to have one driver work on multiple operating systems is for the operating systems to all use the same driver model.
The ugly way is to create your own hardware abstraction layer and then write a translation layer between that and each operating system, because that's complicated and hideous.
But it's especially silly because Linux accepts suitable contributed code, so you could instead use the native Linux model as your "intermediary layer" and fix Linux if it isn't suitable in some way. And then translate that to what the closed operating system you can't modify uses.
The result is that the Linux people are happier and you have one less translation layer to maintain.
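The "one less translation layer" bit is just counting. A small sketch, with made-up function names and an arbitrary example list of target operating systems:

```python
def layers_private_hal(operating_systems: list[str]) -> list[str]:
    """Vendor HAL in the middle: one translation layer per supported OS."""
    return [f"vendor_hal -> {os_name}" for os_name in operating_systems]


def layers_linux_as_hub(operating_systems: list[str]) -> list[str]:
    """Linux driver model in the middle: Linux itself needs no translation."""
    return [
        f"linux_model -> {os_name}"
        for os_name in operating_systems
        if os_name != "linux"
    ]


if __name__ == "__main__":
    targets = ["linux", "windows", "other_os"]  # example targets only
    print(layers_private_hal(targets))
    print(layers_linux_as_hub(targets))
```

With N supported systems, the private HAL needs N translation layers and the Linux-as-hub approach needs N - 1, whatever N is.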
One might ask whether it is desirable to avoid the GPL, and there are a lot of arguments on both sides there, but it's certainly easy to run into issues when you have a GPL licensed module designed to be linked into a proprietary program (kernel).
Isn't the point supposed to be to not have other versions of your driver, so you can use the same one on every platform?
But Linux represents a tiny portion of the gaming community, so that approach would make no sense at all for a GPU vendor. C'mon.
I know that Linux people really really just want the kernel to take one for the team so they can have GPUs because that's just the goal, and clearly the goal is good and the means don't matter at all and everything else is irrelevant. 100,000 lines of crap code, 200k? 500k? Who cares, it's all in the name of GPUs clearly. It's obviously worth it no matter what.
But the kernel developers do not see it that way, and for good reason -- because once it's in tree, they are all on the hook for it and they all have to deal with the swamp, the added complexity, the maintenance, the un-fucking of this entire HAL, etc etc.
Having worked on a large open source project, I can assure you, it sucks when you have to say "This isn't acceptable and we aren't merging it", even when it's a feature the users want, and one someone worked on for a long time. It is also, almost always, the right thing to do in the long run (and several of those features did come back, in acceptable ways, in our case).
The growth market for GPUs is GPGPU and servers. And Linux represents a large portion of the programming and server communities.
More to the point, as soon as you support Linux at all then it doesn't matter who has more share, it's still less work to do the above than have to maintain another translation layer.
And you really believe that the maintainers will be accepting a giant patch that changes the API and subsystem completely (though into something better) that has the risk of causing lots of regressions to existing drivers? And you believe that AMD is supposed to fix all the regressions that are caused in drivers by other vendors that this change causes?
And yes - who else is supposed to fix all the regressions caused by changes that AMD wants? Volunteers who would rather work on something else? If you want a change, you get to support the regressions - and if AMD's work gets merged, then anyone ELSE who wants to make a change in that page needs to support AMD's regressions.
Hence wanting to make sure that the changes from AMD are manageable and flexible enough to allow further changes.
If the driver doesn't really belong in the Linux kernel source for those reasons, it's better to keep it outside the kernel tree.
code re-use between drivers of different vendors but the same kernel/OS,
VS
code re-use between drivers of the same vendor but different kernels/OSes.
At the end of the day, both sides are arguing for code re-use, of sorts.
This might theoretically make sense if the Linux subsystem was very stable over many years. Practice shows that the Windows interfaces are what are a lot more stable over the years and changes in them are communicated for a long time beforehand so that hardware vendors can begin changing their drivers long beforehand.
However, it is introducing a second API for a very specific subset of hardware into a kernel that is being developed by not just AMD people. Dave Airlie is rightly saying that the second API and hence two different code structures makes the whole DRI infrastructure harder to maintain for everyone else.
And Dave's responsibility is to everyone else, not to AMD.
It is a bad thing for the targets as they implement both the driver functionality and the abstractions required to make the same code work cross platform. The response linked describes the cost of those abstractions to the target (Linux kernel in this case).
According to
> https://technet.microsoft.com/en-us/library/eb9b35a3-64e9-4c...
the central component that Microsoft requires is a code signing certificate (i.e. money and perhaps a little boring bureaucracy, which still can be done in a very systematic way). What Microsoft tests internally at the drivers is whether they have potential dangerous security bugs (e.g. buffer overflows). You can architect the drivers as you want to (though Microsoft provides guidelines and reference source code to make it easier to write drivers "the officially desired way").
Getting the drivers into the Linux kernel means - as one can see - going deeply into kernel politics. If Microsoft required something similar, the hardware vendors would tap their forehead at Microsoft.
It's not as though if AMD had developed drivers for Linux first that they could then go bully MS into allowing them to patch the Windows kernel so that they didn't have to modify their Linux drivers to get them ported to Windows.
I'd imagine if you wanted to include code in the NT kernel they would. But since they don't allow you to include your code in the NT kernel at all, they don't require what the Linux kernel developers require,
i.e. MS is not providing the same level of access, so it doesn't have the same requirements.
Fork Android OSP and fix that. I'm sure that CyanogenMod will allow me to use native libraries as much as I want.
BTW. I'm not sure if my distro (Fedora) will work with vanilla kernel without Redhat patches. There was times when it wasn't. I compiled kernel myself with my own patches and configuration in between 2001-2008.
What I'm trying to say is that something being possible in theory doesn't mean it is possible in practice.
Android-specific APIs or subsystems are one part of it, then there are device-specific patches. Kernel and patches being open source makes it possible to switch to a new kernel version, but I've almost never seen that exercised. A few years ago the stats were that a typical consumer phone contains millions of lines of patches on top of the selected upstream kernel version. It is not feasible to rebase a typical device onto a newer kernel version, and it really shows: I've seen only a few phones that got a newer kernel version than they originally shipped with, switching from an ancient kernel version to an only slightly newer but still ancient kernel version.
> BTW. I'm not sure if my distro (Fedora) will work with vanilla kernel without Redhat patches. There was times when it wasn't. I compiled kernel myself with my own patches and configuration in between 2001-2008.
I'm pretty sure it will. Linus himself uses Fedora and he likes his kernel pure vanilla :)
No point trying to play D. Quixote attempting business on the desktop with such mentality.
Linux maintains compatibility by fixing the driver themselves when they break it. Microsoft cannot (actually, can, and does) break their interfaces since they don't control the drivers.
This allows Linux to keep improving without breaking things in production; while Microsoft has to either maintain huge backward compatibility abstractions for changes, go YOLO and break stuff (often unknowingly) or abstain from improving their OS.
Game developers might like to see clean driver source but they don't get to choose what kind of GPU their customers have already bought. And 99% of gamers are not going to choose their GPU based on Linux drivers. So nobody has any leverage and vendors have no incentive to change.
Meanwhile thousands of universities and institutions are each going to be looking for 25,000 GPUs and they can choose what brand they buy based on what makes their internal developers happy. Hosts like Amazon and Google are each going to be buying millions of GPUs, and having better and more transparent drivers so they can more easily e.g. improve power consumption by a small percentage, can save them a million dollars/year in electricity.
Someone like Google could come to each vendor and say "first to have mainline kernel drivers gets all our business" at any point. Or the same result in the other order; once there are clean drivers third parties are more likely to make power consumption and performance improvements that give AMD the edge when the major customers crunch the numbers.
There is a significant competitive advantage in it for AMD to get this right.
I can't speak to the legal status of GPL drivers for Windows, but several seem to exist already (e.g. Windows ext4 driver), and if they were actually worried about it they could always get explicit permission from the copyright holders of the relevant code. Either they say yes and you're fine, or they say no and you know which pieces of code to replace.
And what about a change to a stable internal kernel API, which the kernel developers refuse?
https://www.kernel.org/doc/Documentation/stable_api_nonsense...