Django: Reformatted code with Black

284 points by tams 4 years ago | 245 comments

samwillis 4 years ago |

From memory, I believe Django decided to move to Black back in 2019 [0] but delayed the change until Black exited beta. Black left beta at the end of January [1].

This was finally merged to the main branch today [2].

I suspect lots of other projects, both open source and private, are also making the change now. This is a show of confidence in Black as the standard code formatter for Python.

0: https://github.com/django/deps/blob/main/accepted/0008-black...

1: https://news.ycombinator.com/item?id=30130316

2: https://github.com/django/django/pull/15387

dwightgunning 4 years ago | |

This is right. Black emerging from beta was discussed on the Django mailing list in the last week or so, and triggered the work.

bwhmather 4 years ago |

Shameless plug: For people who like black, I've been working on ssort[0], a python source code sorter that will organize python statements into topological order based on their dependencies. It aims to resolve a similar source of bikeshedding and back and forth commits.

0: https://github.com/bwhmather/ssort

evilsnoopi3 4 years ago | |

We use isort[0] for this. It even has a "black" compatible profile that splits lines according to black's defaults. Additionally we use autoflake[1] to remove unused import statements in place.

[0]: https://github.com/PyCQA/isort

[1]: https://github.com/PyCQA/autoflake

bwhmather 4 years ago | | |

isort only sorts imports. ssort will sort all other statements within a module so that they go after any other statements they depend on. The two are complementary and I usually run both.
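To make "statements go after the statements they depend on" concrete, here is a toy sketch of the idea. It is not ssort's actual implementation: the function name `sort_statements` is made up for illustration, it only handles top-level functions, classes, and simple assignments, it drops comments and original formatting (because it round-trips through `ast.unparse`), and it raises `graphlib.CycleError` on mutually recursive definitions. It assumes Python 3.9+ for `graphlib` and `ast.unparse`.

```python
import ast
from graphlib import TopologicalSorter


def sort_statements(source: str) -> str:
    """Reorder top-level statements so each one follows what it depends on."""
    statements = ast.parse(source).body

    # Map each top-level name to the index of the statement that binds it.
    defined_by = {}
    for index, stmt in enumerate(statements):
        if isinstance(stmt, (ast.FunctionDef, ast.ClassDef)):
            defined_by[stmt.name] = index
        elif isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    defined_by[target.id] = index

    # For every statement, collect the statements whose names it reads.
    graph = {}
    for index, stmt in enumerate(statements):
        used = {
            node.id
            for node in ast.walk(stmt)
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
        }
        graph[index] = {
            defined_by[name]
            for name in used
            if name in defined_by and defined_by[name] != index
        }

    # static_order() yields dependencies before their dependents.
    order = TopologicalSorter(graph).static_order()
    return "\n\n".join(ast.unparse(statements[i]) for i in order)


example = """
def main():
    return helper() + BASE

def helper():
    return 1

BASE = 10
"""

# In the result, main() ends up after both helper() and BASE.
print(sort_statements(example))
```

Note that this sketch treats a name read inside a function body as a dependency of the function definition itself, which is the conservative choice discussed further down the thread.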

drcongo 4 years ago | |

This is relevant to my interests. We have an internal code style guide at my company that includes guidelines for order of class statements, roughly matching yours. I have one pet peeve that made me write the style guide in the first place - Django's `class Meta` which we always have at the top of the class because it contains vital information you need to know as a programmer, like whether this class is abstract or not. Whenever I have to work with an external Django codebase and find myself scrolling through enormous classes trying to find the meta my blood pressure rises.

bwhmather 4 years ago | | |

I've had the same problem with pydantic. Currently, properties are special cased and moved to the top. Everything else, including classes, is grouped with methods. Meta classes will end up somewhere in the middle, which is probably the worst possible case.

SSort is currently used for several hundred kilobytes of python so I'm wary, but if I'm going to make a breaking change before 1.0 then I think this is likely to be it.

atoav 4 years ago | |

Some illustrative before-after syntax-highlighted code segments would be a nice addition for the readme.

l-lousy 4 years ago | | |

He added some :)

VWWHFSfQ 4 years ago | |

Looks cool but it seems like it might still need some work?

I tried it on one of my Django `admin.py` files and it created NameErrors.

    class TestAdmin(admin.ModelAdmin):
        list_filter = ("foo_method",)

        def foo_method(self, obj):
            return "something"

        foo_method.short_description = "Foo method"

    # It turned it into this:

    class TestAdmin(admin.ModelAdmin):
        list_filter = ("foo_method",)

        # NameError
        foo_method.short_description = "Foo method"

        def foo_method(self, obj):
            return "something"

bwhmather 4 years ago | | |

Yup, that's a bug. All assignments are treated as properties and moved to the top. Fix to follow shortly.

stjohnswarts 4 years ago | |

This sounds like a living hell if you use git diff a lot to check for small changes that might introduce a bug, which is what happens at work all the time since our unit tests and CI are a joke. Not dumping on your project, but the idea of shuffling the code around that much scares the dickens out of me.

danuker 4 years ago | | |

Once the code is initially migrated (which should not break it), the diffs won't be large, since the order should be consistent.

drcongo 4 years ago | | |

Use it at the editor level instead of in CI and I can't see how it can cause you any problems at all. I could easily be missing something though?

aaronchall 4 years ago | |

Does it put high-level business logic at the top or the implementation details at the top?

Which is preferable, and why?

bwhmather 4 years ago | | |

Implementation details at the top. Python is a scripting language so modules are actually evaluated from top to bottom. Putting high level logic up top is nice when you just have functions, which defer lookup until they are called, but you quickly run into places (decorators, base classes) where it doesn't work and then you have to switch. Better to use the same convention everywhere. You quickly get used to reading a module from bottom to top.
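A minimal sketch of the difference described here, using made-up names (`high_level`, `low_level`, `Parent`, `Child`): a function body looks names up only when it is called, while a base class is evaluated as soon as the `class` statement runs. The failing case is wrapped in `exec` so the example itself still runs to completion.

```python
# Function bodies defer name lookup until call time, so "high-level first" works.
def high_level():
    return low_level() + 1


def low_level():
    return 41


result = high_level()  # low_level exists by the time this call happens

# Base classes (and decorators) are evaluated when the statement executes,
# so referring to a name defined further down fails immediately.
snippet = """
class Child(Parent):
    pass

class Parent:
    pass
"""

try:
    exec(snippet, {})
    definition_error = None
except NameError as exc:
    definition_error = exc

print(result, type(definition_error).__name__)
```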

BiteCode_dev 4 years ago | |

Very interesting, especially the method order part. I dislike the order you chose, and yet I would be tempted to use it on my projects anyway, because consistency is so important to me.

Bedon292 4 years ago | | |

This is about how I initially felt with black. I didn't like some of the things it did, but I was happy to have a standardized opinionated formatter so I went with it. Was definitely the right decision.

JshWright 4 years ago | | |

A standard no one likes is often better than no standard at all.

progval 4 years ago | |

Could you show an example in the README? The first two pairs of input/output in https://github.com/bwhmather/ssort/tree/master/examples look unchanged

bwhmather 4 years ago | | |

Will do. Examples directory isn't terribly helpful as documentation as it mostly contains real code with problematic syntax (and compatible licensing) that tripped up ssort when I ran it on a copy of my pip cache. I will move it into tests to avoid confusion

BeFlatXIII 4 years ago | |

Thanks for sharing this. When solo coding, I tend to dump new classes and functions wherever is physically closest to where I was previously editing. It makes sense in the moment, so I don't disrupt my train of thought by jumping all over the file, but then it is a confusing ball of mud when I need to return to the project after time off. Was the shortest scroll direction up or down when I implemented it? etc…

jansky 4 years ago | |

This is great. Imagine I have a global variable that is used inside a function, but the variable is only assigned after the function definition, and the function is called later still. Why does ssort move the assignment of the global variable before the function definition?

    def myfunc():
        global globalvar
        str(globalvar)

    globalvar = 'abc'

    myfunc()

will be transformed into

    globalvar = 'abc'

    def myfunc():
        global globalvar
        str(globalvar)

    myfunc()

I understand why it is done, but I don't want my block of function definitions interrupted by variable assignments (which I do later), since the order has no impact on my code and keeping them separate makes it a bit "cleaner". Don't tell me not to use global variables :D

mulmboy 4 years ago | |

Sounds interesting and perhaps novel. Might help if there were an example or two in the readme - as it is I still don't exactly know what this is.

dopeboy 4 years ago | |

Very cool - I'll be following this.

mrtranscendence 4 years ago |

I've been using black at work for over a year now. I don't much care for some of the choices it makes, which can sometimes be quite ugly, but I've grown used to it and can (nearly) always anticipate how it will format code. One nice side effect of encouraging its use: where I work, at least, it was very common to use the line continuation operator \ instead of wrapping an expression in parentheses. I always hated that and black does away with it.
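For readers unfamiliar with the two spellings being contrasted, a minimal sketch follows. The variable names are made up for illustration, and the snippet only shows that both forms are equivalent Python; it makes no claim about exactly how any formatter rewrites them.

```python
first, second, third = 1, 2, 3

# Backslash continuation: fragile, since trailing whitespace after "\" is a syntax error.
with_backslash = first + \
    second + \
    third

# Implicit continuation inside parentheses: the style the comment prefers.
with_parentheses = (
    first
    + second
    + third
)

print(with_backslash == with_parentheses)
```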

What I don't much care for is reorder-python-imports, which I think is related to black (but don't quote me). For the sake of reducing merge conflicts it turns the innocuous

    from typing import overload, List, Dict, Tuple, Optional, Any

into

    from typing import overload
    from typing import List
    from typing import Dict
    from typing import Tuple
    from typing import Optional
    from typing import Any

Ugh. Gross. Maybe I'm just lucky but I've never had a merge conflict due to an import line so the cure seems worse than the disease.

Edit: Just to be 100% clear: this is reorder-python-imports, not black. I thought they were related projects, though maybe I'm wrong. Regardless, black on its own won't reorder imports.

VBprogrammer 4 years ago |

Reading some of the comments here it's become clear to me that the next stage in the development of auto-formatters is to have the formatter commit the code as a canonical format but to display the code to each individual contributor in the style of their choosing. Thus removing all kinds of arguments about whether 80 or 120 columns is the one true width.

TheRealPomax 4 years ago |

The reason to use Black is the same as Prettier on the HTML/CSS/JS side: forever stop having an opinion on code style, it's wasted time and effort. Any "it's not exactly what we want" comment with an attempt to customize the style to be closer to "what we were already using" is exactly why these things exist: by all means have that opinion, but that's exactly the kind of opinion you shouldn't ever even need to have, tooling should style the code universally consistently "good enough". Which quotes to use, what indent to use, when to split args over multiple lines, it's all time wasted. Even if you worked on a project for 15 years, once you finally add autoformatting, buy in to it. It's going to give you a new code style, and you will never even actively have to follow it. You just need to be able to read it. Auto-formatting will do the rest.

wraptile 4 years ago | |

Except Python is a general purpose programming language, so it's hard to have a one-size-fits-all solution when style varies based on the medium you're working with. Are you making an OOP GUI app? Django? Something that is using loads of long XPaths?

yurishimo 4 years ago | | |

I don't know if that applies. Ideally, a good code formatting tool would work with any project. If there is a specific flag you want to disable for some block to use your own format, then the tool should support that.

As a couple of examples, PHP has had a unified formatting standard since 2013 and Elixir has a formatter built into the language. Both languages need the formatter to be enabled by your IDE/CI and that's also the case for Black.

pyuser583 4 years ago | | |

Python throws exceptions if you don’t have the right number of indents.

declnz 4 years ago |

Aside: I love a good linter, but as a long-time Python fan I find it sad that Black has so little configuration (yes, I know, but still) and moreover that it often produces code that no human Python dev I know would write...

Python was always meant to look concise / beautiful... (MyPy has made this trickier too)

wyuenho 4 years ago |

Every time I was tempted to do something like this, I hesitated because I didn't want every other line in every file with my name on a single commit, mostly to avoid making git blame harder than necessary. It would be nice if there was a kind of diffing algorithm that can diff code units *syntactically* across history.

tomp 4 years ago |

worst things about Black:

- doesn't respect vertical space - sure, making the code fit on screen might be valuable (though the default width should be at least 120 characters, I mean we're in 2022 after all), but Black does it by blowing up the vertical space used by the code

- spurious changes in commits - if you happen to indent a block, Black will cause lines to break

- Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width

codingkev 4 years ago |

A little shoutout to an alternative Python formatting tool https://github.com/google/yapf (developed by Google).

The built-in "facebook style" formatting felt by far the most natural to me with the out of the box settings and no extra config.

timhh 4 years ago | |

I did a blind survey of YAPF vs Black at my work. The results came back as 70% in favour of Black.

Black gives generally nicer output, and also more predictable output because its folding algorithm is simpler. YAPF uses a global optimisation which makes it make very strange decisions sometimes. Black does too, but much less often.

There are also non-style problems with YAPF. It occasionally fails to produce stable output, i.e. yapf(yapf(x)) != yapf(x). In some cases it never stabilises - flip flopping between alternatives forever!

Finally it seems to have very bad worst case performance. On some long files it takes so long that we have to exclude them from formatting. Black has no issue.

In conclusion, don't use YAPF! Black is better in almost every way!
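The stability property mentioned above, format(format(x)) == format(x), can be checked generically. The sketch below does not call YAPF or Black; `is_idempotent`, `strip_trailing_whitespace`, and `flip_flop` are hypothetical stand-ins, with `flip_flop` deliberately written to oscillate between two outputs the way the comment describes.

```python
from typing import Callable


def is_idempotent(formatter: Callable[[str], str], source: str) -> bool:
    """Return True if formatting already-formatted output changes nothing."""
    once = formatter(source)
    return formatter(once) == once


def strip_trailing_whitespace(source: str) -> str:
    # A well-behaved toy formatter: a second pass has nothing left to do.
    return "\n".join(line.rstrip() for line in source.splitlines()) + "\n"


def flip_flop(source: str) -> str:
    # A deliberately unstable toy formatter: it toggles the quote style
    # on every pass, so its output never settles.
    if '"' in source:
        return source.replace('"', "'")
    return source.replace("'", '"')


sample = 'x = "a"   \n'
print(is_idempotent(strip_trailing_whitespace, sample))
print(is_idempotent(flip_flop, sample))
```

A check like this is cheap to run over a whole repository before adopting a formatter in CI.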

VectorLock 4 years ago | | |

How did you perform the blind survey? Format some code with Black and YAPF and ask people which they liked better?

lelandbatey 4 years ago | |

YAPF is slower than Black for many degenerate cases, a fact I notice most strongly since I use an "auto-format file on file save" extension in my editor. The case I found in particular was editing large JSON schema definitions in Python, as they're represented as deeply nested dictionaries. Black seems to format them in linear time based on the number of bytes in the file, while YAPF seems to get exponentially slower based on the complexity of the hard-coded data structure. It was a niche case, and the maximum slowdown was only ~1-2 seconds, but that editing freeze was quite annoying.

BiteCode_dev 4 years ago | |

yapf is configurable, and that's why it never won.

crad 4 years ago | | |

What's wrong with configurable? Too much opportunity to bikeshed?

I figured yapf was not "new" which is why black won.

Starting about 5-6 years ago there was a push in the Python community to replace solved problems with new ones in what appears to me as chasing the JavaScript community.

Instead of consolidating on existing tools that worked well but had some rough edges to smooth out, numerous projects came about to reinvent the wheel.

daenz 4 years ago |

I'm so happy that languages are settling more and more on heavy reformatter usage. I'd like to think it was triggered by Go and gofmt. Working on a team where each engineer has their own personal syntax is not fun.

belval 4 years ago | |

Indeed, I don't like Black's style, but I'd rather work in a Black codebase than in one where everyone has their own preference. Having style guidelines in a team is also a great way to remove pointless debates when reviewing PRs.

daenz 4 years ago | | |

Agreed, and what's interesting is that despite all of those pointless style debates, there hasn't been much pushback on using reformatters (that I've seen). This tells me that the debates weren't really about "my style is objectively best" but more about "I'd like to use a consistent/predictable style (with preference to mine)."

declnz 4 years ago | | |

...which is why I wish Black allowed more configuration. A team can often agree on a set of styles. Every team on the Python planet agreeing... now that's much harder

kaesar14 4 years ago | |

Go and gofmt definitely pushed a lot of the momentum of the current wave but don't forget to give respect to Ruby / Rubocop where it's due, where the adage of Convention over Configurability has reigned supreme for decades.

NegativeLatency 4 years ago | | |

Rubocop has about a thousand config options

glacials 4 years ago |

Black is slowly creeping into gofmt-level universality in the Python community and it’s great. The next big milestone is a first-party recommendation by python.org itself.

VWWHFSfQ 4 years ago | |

I'm pretty sure it's a PSF project

spc476 4 years ago | |

No, the next big milestone is embedding the format style as the syntax of the language. I'm curious as to why Go didn't even do this (they should have, in my opinion, but wimped out and left it to an external tool).

shpx 4 years ago | | |

If they change

    print(repr('some string'))

to print

    "some string"

instead of

    'some string'

then that would remove the only hangup about Black that I have.
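For context, this is the built-in behaviour being referred to: `repr` of a string prefers single quotes and only switches to double quotes when the string itself contains a single quote (and no double quote). The variable names are just for illustration.

```python
plain = repr("some string")
with_apostrophe = repr("it's")

print(plain)            # 'some string'
print(with_apostrophe)  # "it's"
```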

ibejoeb 4 years ago |

In general, what are the strategies for large public codebases like this to mitigate supply chain attacks or other source-level attacks?

For clarity, I'm hoping to open up a discussion about how we deal with massive changesets like this, which are difficult to review chiefly because of their breadth.

sciurus 4 years ago | |

For a purely mechanical change like this, someone could run black against the same revision of Django and verify the changes they see locally match the changes in this PR.

ibejoeb 4 years ago | | |

That's true as long as the results are predictable and reproducible. I don't happen to know if Black is, and it's not apparent from the documentation.

Update: Found it:

> How stable is Black’s style?

> Starting in 2022, the formatting output will be stable for the releases made in the same year

https://black.readthedocs.io/en/stable/faq.html

fritzo 4 years ago | |

Interesting! Can you help me imagine attack scenarios? All I can think of is:

- The changeset is authored by a trusted committer but the committer's tools have been locally compromised.

- The public tool itself (e.g. black) has been compromised to automatically create vulnerabilities in difficult-to-review bits of code (a Ken Thompson hack).

jamessb 4 years ago | |

As a reformatting tool should only change the formatting, you could check that the Abstract Syntax Tree is unchanged. The ast module in the standard library gives access to the AST [1].

[1]: https://docs.python.org/3/library/ast.html
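A minimal sketch of that check, using only the standard library. The helper name `same_ast` is made up for illustration. `ast.dump` omits line and column attributes by default, so layout and quote style do not affect the comparison. One caveat to keep in mind: a formatter that also adjusts whitespace inside docstrings would show up as a difference under a strict comparison like this one.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Compare two source strings by syntax tree, ignoring layout."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


original = "x = {'a': 1,\n     'b': 2}\n"
reformatted = 'x = {"a": 1, "b": 2}\n'
tampered = 'x = {"a": 1, "b": 3}\n'

print(same_ast(original, reformatted))  # only formatting changed
print(same_ast(original, tampered))     # a value changed
```

Run over every file touched by a reformatting PR (before and after revisions), this gives a mechanical signal that nothing but formatting changed.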

justinmchase 4 years ago |

The output does look better, but this also just looks like every PR for applying a linter / formatter I've ever seen. Not sure why this is newsworthy.

simonw 4 years ago | |

It's a significant milestone in the adoption of Black by influential projects within the Python ecosystem, which makes it a good hook for discussing the idea that Black, now stable, is becoming established as the standard for code formatting for Python.

owaislone 4 years ago | |

Using black is not about how the code looks but to eliminate an entire suite of review comments/discussions. Everyone simply runs black over all code before submitting and no one ever comments about how anything is formatted.

captainmuon 4 years ago | | |

Naive question, but why is everybody so aggravated by formatting discussions? It seems to be a widespread opinion that these discussions are just 1) pointless and 2) difficult and time consuming.

My personal experience is that 1) in many cases you do benefit from taking a moment, going through your code and thinking about presentation. And 2) I find it not at all difficult to settle. A change either doesn't matter, then you just don't discuss it at all, or it is important, then you quickly agree on the best solution. (In the worst case, "best" means what the project lead finds prettier.) If you don't have a social mechanism to agree on something as basic as coding style, then your team probably has bigger problems.

I actually find robo-formatted code annoying to read: Go code from a bloody beginner who doesn't know what they are doing looks exactly like carefully tended, highly thought-out code. And in autoformatted Python, you cannot, for example, make formulas clearer by removing spaces around operators with higher precedence. Parentheses placement is dictated by how long words are and not by what logically belongs together, etc.

mbot5324 4 years ago | | |

By chaining yourself to a format preferred by a machine, you free yourself of having to understand how and why another human thinks the way they do and prefers what they prefer.

Simply give up your mind and you too can be free.

tayo42 4 years ago | | |

With a style guide and linter I've never experienced this and idk why you would. Then the only time style comments come up is pointing someone to the guide

VWWHFSfQ 4 years ago |

So now when you look at the annotated change history all you're going to see is a bunch of changes by the person that reformatted the code instead of the person that wrote it.

tempay 4 years ago | |

The `.git-blame-ignore-revs` file can be used to ignore that (and will be [1]). Unfortunately GitHub doesn't support it, but at least it's possible to have clients behave in a reasonable way.

[1] https://github.com/django/django/pull/15387#issuecomment-103...
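As a rough sketch of what such a file contains: one full commit hash per line, with `#` lines as comments. The helper `record_ignored_rev` and the hash below are placeholders invented for illustration, not Django's actual tooling or commit. The demo writes into a temporary directory rather than a real repository.

```python
import tempfile
from pathlib import Path

IGNORE_FILE = ".git-blame-ignore-revs"


def record_ignored_rev(repo_root, commit_hash, note):
    """Append one commit to the ignore file, preceded by a '#' comment line."""
    path = Path(repo_root) / IGNORE_FILE
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"# {note}\n{commit_hash}\n")
    return path


# Demo in a throwaway directory; the hash is a placeholder, not a real commit.
with tempfile.TemporaryDirectory() as tmp:
    written = record_ignored_rev(
        tmp,
        "0123456789abcdef0123456789abcdef01234567",
        "Reformatted code with Black",
    )
    contents = written.read_text(encoding="utf-8")

print(contents)

# Each developer then opts in once per clone (Git 2.23 or newer):
#   git config blame.ignoreRevsFile .git-blame-ignore-revs
```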

terr-dav 4 years ago | | |

You can automate setup for developers using this simple script:

https://github.com/ipython/ipython/pull/12091/files

And here’s a GitLab issue requesting support for blame-ignore:

https://gitlab.com/gitlab-org/gitlab/-/issues/31423

I don’t think there’s a corresponding GitHub request, but maybe if GitLab adds this feature GitHub will have some incentive to follow suit.

sciurus 4 years ago | | |

For anyone looking for more explanation of this feature:

https://michaelheap.com/git-ignore-rev/

rowanseymour 4 years ago |

I love this except the use of the default black line length of 88. One of the things I appreciate about gofmt is being trusted with deciding on line breaks.

NAHWheatCracker 4 years ago |

I suggested Black to a team I was on a year ago and one developer hemmed and hawed about how he likes to format arrays or something. I didn't win any friends by pointing out that disregarding those personal preferences is part of why I was recommending it.

A year later and it seems to be the default on all projects I'm working on and I'm loving it.

themeiguoren 4 years ago | |

Autoformatters are hell for 2d arrays of data where the columns have meaning and you want them to be aligned (time series, matrix math). It’s my only real gripe.
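One escape hatch worth knowing about here: Black leaves code between `# fmt: off` and `# fmt: on` comments untouched, so hand-aligned tables can be preserved. A small sketch follows; the matrix and the `apply` helper are made up for illustration.

```python
# fmt: off
ROTATE_90 = [
    [ 0.0, -1.0,  0.0],
    [ 1.0,  0.0,  0.0],
    [ 0.0,  0.0,  1.0],
]
# fmt: on


def apply(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


rotated = apply(ROTATE_90, [1.0, 0.0, 0.0])
print(rotated)
```

The trade-off is that the markers themselves add noise, so this tends to be reserved for the few blocks where column alignment carries meaning.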

jnothing 4 years ago |

Why is it impossible to rebase? I didn’t understand the conversation around merging and rebasing

vitorfs 4 years ago |

This is such great news. We've been using Black at the company I work for over the past 3 years or so, and it was a game changer for code reviews. Hopefully other open source Python/Django projects will follow the lead.

umvi 4 years ago |

What's the point of putting linters into CI? Is the point to fail the build if the code wasn't pre-formatted with i.e. Black? Or is the point to autoformat and autocommit the formatted code?

bckr 4 years ago | |

> Is the point to fail the build if the code wasn't pre-formatted with i.e. Black?

It's this one

> Or is the point to autoformat

This one is done with pre-commit (which should probably be named pre-push?) hooks

> and autocommit the formatted code?

I don't think this one is done, and I think it's undesirable

mkesper 4 years ago | | |

Pre-commit hooks really happen when you type 'git commit'. If you have failing checks in them, your commit will be aborted.

selestify 4 years ago | |

> Is the point to fail the build if the code wasn't pre-formatted with i.e. Black?

It's this. Ensures that anything merged to master keeps the formatting conventions established in the project.

seattle_spring 4 years ago | |

The former, in my case. Last thing I want is someone merging their own "creative interpretation" of proper formatting.

euler_angles 4 years ago |

Had a great experience with black. Only thing I did was change its default line length limit to 120 characters (I was regularly dealing with signal names from source data that were about 90 chars).

wolverine876 4 years ago |

Do Black and other autoformatters enable significantly more reusable code and computer-generated code? Formatting is certainly not the only or greatest barrier, but if format is standardized across projects, it's easier to plug and play code from outside.

ReleaseCandidat 4 years ago |

I would really appreciate it if there were exactly _one_ formatter (without any options) per language.

It is way better to deal with ugly formatting as long as it is consistent than with discussions where to put a closing brace/bracket/paren.

MahajanVardhan 4 years ago |

I am so sorry, but what is Black? I use django but I have never heard of Black

rcv 4 years ago | |

Black is a tool that can reformat Python code. It's remarkable for its lack of configuration.

https://github.com/psf/black

SoylentOrange 4 years ago |

I’ve been using black for about a year and I’m generally a big fan. However my biggest gripe with it is bad VS Code integration.

claytonjy 4 years ago | |

bad how? i use vscode, I save a file, it reformats on save, that's it.

phplovesong 4 years ago |

Good bye git history!

Noumenon72 4 years ago | |

They used .git-blame-ignore-revs.

yedpodtrzitko 4 years ago | |

hello .git-blame-ignore-revs

supreme_berry 4 years ago |

The “Black” developer refused for a long time, and in a very aggressive manner, to add an option to format code with single quotes. Now the Django devs didn’t see that option for single quotes, and the code looks unpleasant.

vitorfs 4 years ago | |

I have always used single quotes for Python code since I started working with it. When I started to adopt Black on my projects it indeed felt weird and the code looked unpleasant. But after a while you get used to it.

Some people make the case that it's easier to type single quotes (well, depending on the keyboard layout anyway). On US-layout keyboards you have to hold the Shift key to type a double quote. But the good thing about Black is that you can still write your code using single quotes, and when you run the command line utility it will normalize the code to use double quotes.

Nowadays I'm so used to it that I even write my Python code using double quotes, and Python code using single quotes looks weird/unpleasant to me.

spc476 4 years ago | | |

I use single quotes for items that, while technically a string, could be considered a value or symbol. For example:

     syslog('debug',"Just opened %s for output",filename)

While there's no semantic difference between single and double quote, in my code base, there is. And if black becomes very popular, why even support single quotes anymore?

digisign 4 years ago | | |

The repl still uses single quotes.

INTPenis 4 years ago | |

I reacted to this too, in the changed files tab.

Technically single or double quotes have the exact same meaning in Python. What makes people use single quotes is probably other languages like PHP, Perl and Bash.

I know I've made it a habit to default to single quotes unless I know I need double quotes. So that might be where the habit comes from in the Django project. But it's not actually necessary in python so might as well use the most commonly used type of quote.

pchf 4 years ago | |

To keep the single quotes, which in my opinion make the code less cluttered and closer to the REPL, I use the pre-commit hook double-quote-string-fixer, in conjunction with black's option skip-string-normalization set to true.

dplgk 4 years ago | | |

And black is supposed to make our lives easier?

digisign 4 years ago | |

Use nero or blue instead, which both use single quotes.

    $ cat foo.py
    from x import a, b, d, e
    from x import c as C
    $ isort foo.py
    Fixing /tmp/foo.py
    $ cat foo.py
    from x import a, b
    from x import c as C
    from x import d, e