Edit: based on your profile I see you're quite anti-Microsoft for some reason. I think one doth protest too much in this case though.
[1] https://repography.com/app/0/strawberry-graphql/strawberry/s...
Also shout out to the (mostly useless but cool looking) git history visualizer, Gource[0].
I'd say that Gource is actually pretty useful for figuring out where most of the effort has been concentrated recently! For example, when added to a new project, I might run it against the repo to see which packages in the project have been changed the most in the past month, what people are working on and so on.
git log --pretty=format: --name-only | sort | uniq -c | sort -rg | head -n 30

Otherwise looks really cool.
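For anyone who wants the same count without the shell pipeline, here is a rough Python sketch of that one-liner. The function names `git_log_names` and `most_changed` are made up for illustration; the only git flags used are the ones from the command above. Unlike the pipeline, it skips the blank separator lines instead of counting them.

```python
import subprocess
from collections import Counter


def git_log_names(repo_path: str = ".") -> str:
    """Run `git log --pretty=format: --name-only` and return its raw output."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:", "--name-only"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def most_changed(log_output: str, top: int = 30) -> list[tuple[str, int]]:
    """Count how often each path appears in the log output, most frequent first."""
    paths = (line.strip() for line in log_output.splitlines())
    counts = Counter(p for p in paths if p)  # drop the empty lines between commits
    return counts.most_common(top)
```

Usage would be something like `most_changed(git_log_names("path/to/repo"))`, which returns (path, count) pairs.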
I couldn't resist and threw `gitoxide` at it, and it turned out to be more than 2x as fast (even though it uses way more CPU to do that, there is definitely room for improvement).
The PR which adds the `db-gen` program: https://github.com/jmforsythe/Git-Heat-Map/pull/6
File "git-database.py", line 263, in <module>
main()
File "git-database.py", line 247, in main
last_commit = handle_commit(cur, lines)
File "git-database.py", line 178, in handle_commit
handle_match(cur, matches[i], commit_lines[2+i], fields)
File "git-database.py", line 197, in handle_match
p, n = secondary_line.split("|")
ValueError: not enough values to unpack (expected 2, got 1)

[0] https://wakatime.com/blog/58-chatgpt-prototyped-our-new-feat...
https://github.com/nixos/nixpkgs/ would be a great benchmark for a tool like this :) One of the larger repos on github, close to half a million commits by a large set of contributors to thousands of files.
[0] https://www.amazon.com/Books-Adam-Tornhill/s?rh=n%3A283155%2... [1] https://codescene.com/
Nice to see which parts are changing the most in a project to maybe see if it should be improved or at least to direct efforts of improving quality to these spots.
It would be nice if the heatmap showed additions, deletions, and possibly modifications separately. I can't think of a suitable visualization method, but being able to see them individually would give more insight.
I reviewed the OP's code and did some benchmarks; SQLite is not the bottleneck here. The code first generates the commit info from the git log and prints it to stdout [1], and the Python script reads it from stdin one commit at a time in a loop [2]. Each commit's info is then written to SQLite. So, with or without WAL, the time is almost the same.
To confirm my hypothesis, I ran the project without the insert calls. On my machine, with CPython, it took 160 seconds with the SQLite inserts and about 159 without them.
I believe the git log will be fast anyway, so other ways to make it faster would be to read a bunch of commits at once and then do batch inserts. We could also run it in parallel, since each commit's info is independent and we don't need to care about ordering while inserting.
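To make the batching idea concrete, a minimal sketch, assuming a hypothetical `commits` table with `hash` and `author_email` columns (the project's real schema may well differ). It groups rows into chunks and writes each chunk with `executemany` inside a single transaction. Whether this buys anything here is an open question, given that the inserts measured as only about a second of the total.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def insert_commits(conn, rows, batch_size=1000):
    """Insert (hash, author_email) pairs in batches; returns the row count."""
    total = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits once per batch, rolls back on error
            conn.executemany(
                # `commits` and its columns are assumed names for illustration
                "INSERT INTO commits (hash, author_email) VALUES (?, ?)",
                chunk,
            )
        total += len(chunk)
    return total
```

The `rows` argument can be a generator, so the parser does not have to hold the whole log in memory.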
[0] - https://avi.im/blag/2021/fast-sqlite-inserts/
[1] - https://github.com/jmforsythe/Git-Heat-Map/blob/bd9bc22/git-...
[2] - https://github.com/jmforsythe/Git-Heat-Map/blob/bd9bc22/git-...
Also I wasn't sure how to get the colors :shrug:
https://github.com/breck7/jtree and https://github.com/breck7/pldb
Update: Got the email. Thank you! NEAR sent!
Yes, this is how critical paradigms like decentralization are gradually broken by Microsoft. Things are built and people have a brief look, but don't actually test them on other platforms. Don't criticize people by looking at their comment history, it's petty and frankly seems tribal. Write about the topic being discussed.
For the highlighting, currently only a single email pattern input is supported. Type your pattern into the box at the bottom, then click submit. The pattern gets fed into a LIKE statement in SQLite, so it is just plain text, with % representing a wildcard, e.g. torvalds% would highlight files modified by an email starting with torvalds.
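As an illustration of how a LIKE pattern behaves, here is a small self-contained sketch. The `changes` table, its columns, and the `files_matching` helper are assumptions for the example, not the project's actual schema or query. Note that a pattern without % only matches the whole address.

```python
import sqlite3

# Hypothetical table and sample data, just to show the pattern matching.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changes (path TEXT, author_email TEXT)")
conn.executemany(
    "INSERT INTO changes VALUES (?, ?)",
    [
        ("kernel/sched.c", "torvalds@example.org"),
        ("README", "someone@example.com"),
        ("Makefile", "torvalds@example.net"),
    ],
)


def files_matching(conn, pattern):
    """Return the distinct paths touched by emails matching a LIKE pattern."""
    rows = conn.execute(
        "SELECT DISTINCT path FROM changes WHERE author_email LIKE ? ORDER BY path",
        (pattern,),  # bound as a parameter, so the pattern is never spliced into SQL
    )
    return [row[0] for row in rows]
```

With this data, `files_matching(conn, "torvalds%")` returns the two files changed by the torvalds addresses, while `files_matching(conn, "torvalds")` returns nothing because no address is exactly that string.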
I had the same minor install issue, specifically needed to do the following:
>chmod +x git-log-format.sh generate-db.sh
>python -m venv .venv
>source .venv/bin/activate
>pip install flask
I'm not sure the colouring is working as expected; I tried submitting many email addresses I know are in the repo it ran over (with and without the wildcard) and the highlighting behaviour was difficult to predict.

At a glance the sh script can be trivially replaced by native Python, which would also make it work cross-platform (i.e. on Windows).
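A rough sketch of what calling git from Python directly could look like. The format string and the function names `git_log_command` and `iter_git_log` are assumptions for illustration; they do not reproduce what the project's sh script actually emits. Passing the arguments as a list avoids going through a shell, which is what makes this portable.

```python
import subprocess
from typing import Iterator


def git_log_command(repo_path: str) -> list[str]:
    """Build the git invocation; the format string is an illustrative guess."""
    return [
        "git", "-C", repo_path, "log",
        "--numstat",                 # per-file added/removed line counts
        "--pretty=format:%H|%ae|%at",  # hash, author email, author timestamp
    ]


def iter_git_log(repo_path: str) -> Iterator[str]:
    """Stream git log output line by line without an intermediate shell."""
    with subprocess.Popen(
        git_log_command(repo_path),
        stdout=subprocess.PIPE,
        text=True,
        encoding="utf-8",
        errors="replace",  # tolerate odd bytes in file names
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, proc.args)
```

The database script could then loop over `iter_git_log(path)` instead of reading from stdin.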