GitHub Copilot is generally available (github.blog)
What I'm concerned about is that it can interrupt the flow of thought while programming, since you have to review the generated code, but that is the price to pay for using such a formidable tool.
I was excited when it came out, but it ran slowly and annoyingly (crashes) and I gave up.
I already have a JetBrains license; when they release a similar feature I’ll consider upping my subscription to get it.
As it stands… eh. It’s not that great. I couldn’t really be bothered to continue using it for free, so I’m not gonna pay for it.
It seems to understand the common boilerplate things in Django that always annoyed me and types them for me. It understands the structure and adapts them to my code: imports, connections between modules, etc.
For sure, you need to be careful with it.
So I'm probably taking it.
They kinda know what they're supposed to do. Sometimes they do the right thing, sometimes they get it completely wrong.
In either case you can never let anything they do get committed without a review.
So are they really helping?
Right now, there is no competition, and an amateur developer will really benefit from copilot - certainly they will be more productive than a developer that demands just $1000 more annual salary.
Do the guys at Microsoft have any morals left?
A decade for general accessibility sounds right.
Some counters for…
CopilotRecommendations, CopilotRecommendationIterations (number of saved changes to the initial recommendation), CopilotRecommendationSaves, SkipToNextSuggestion
Metrics on my code that is used by copilot by others would be nice to see.
The VS extension could check if the current git repository is open, and if so, it should work without a subscription for that specific repository.
I see this useful for non core languages, where you often need to look up common patterns.
Charge 10x more (or more) and let the dreamers help push the product further and faster. Once it’s awesome then charge a commoditized price for the service.
Charging 10x+ more means we have enough skin in the game to properly send feedback and improvement ideas. At $10/month/user it’s barely worth you reading my support tickets, and it’s almost worth me just not using it while paying for it.
Thoughts?
I suffer enough with legacy code created by junior programmers who long ago left the company. I can imagine how much more fun it will be to work with this type of code.
* I know Copilot is not capable of creating full systems yet, but it is a matter of time before they evolve it to generate all the boilerplate code for you based on some comments you make or, even worse, some UML abstraction!
I joined the Copilot beta and where it has helped me most is: 1) Ideas 2) Filling in the broad-strokes.
Its name does not deceive: it is only a co-pilot. It will not tell you where to go, but it will push you in the right direction and let you focus on the more difficult parts of a task.
I signed up for it because at $10 per month the keystroke reduction is somewhere in ballpark of 70% or more. That's the real value in my use-case.
If most code is "bad" code (any definition works) and this AI was trained on all/most code on GitHub, does that mean that this AI mostly helps to produce bad code?
if (x.length < 10) throw
And it figures out the rest. So while sometimes it encourages bad code, when you know how to use it well, it helps you write the good things I'd normally be too lazy to write.
$10/mo. is way too much for what you get.
I mostly write js/ts code.
The suggestion feature / auto-complete feature is wonky at best and leads to bugs or just bad code in the worst case.
Even when you write comments or have a function like `addOne` and you want to add `subtractOne` it will not get it right a lot of times.
Then you have the cases where it throws 50 or more lines of code at you for something very simple.
Catching errors or error handling is basically nonexistent.
I tried it for writing tests. It's bad. It does not help at all.
I uninstalled and after some hours of work I don't really miss it.
But I would be interested in me picking 2 of those 3 for me to do, and the AI can do the third for me. So if I love coding and test writing but don't like documentation, then the AI can do the third leg for me.
I think that the quality of results from the AI would be much better than what Copilot is capable of. Even if I focussed on test writing and documentation, I think that the AI should be able to write decent code based on those two inputs.
I get to a "Confirm your payment details" screen, but there is no further action I can take (ie: no button to press or link to click to "confirm"). It does say "You will be billed $100/year starting August 20, 2022" -- but when I view my "settings", it tells me I haven't signed up for copilot.
I tried various browsers, including Edge on Windows 10 sans plugins (the combination I would expect to be the most supported for MS owned github.com).
There would be no issue if they trained the model on Microsoft's closed source instead.
Given that the cost of a software engineer's time is so high, $10/mo. seems very reasonable if Copilot saves you more than that in time per month. So in a vacuum assuming all dollars are spent with equal productivity, if I take the equivalent of $1000/mo. in time writing boilerplate, and I can reduce that to even $989 with Copilot, it becomes a good deal.
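To make the break-even arithmetic above concrete, here is a rough sketch in Python. The function names and the $60/hour rate are made up for illustration; only the $10/mo. price and the $1000 vs. $989 figures come from the comment.

```python
def net_monthly_benefit(cost_before, cost_after, subscription=10.0):
    """Dollars saved per month after paying for the subscription.

    cost_before / cost_after: dollar value of time spent on boilerplate
    per month without and with the tool (figures from the comment above).
    """
    return (cost_before - cost_after) - subscription


def break_even_minutes(hourly_rate, subscription=10.0):
    """Minutes of saved time per month needed to cover the subscription.

    hourly_rate is a hypothetical figure, not something from the thread.
    """
    return subscription * 60 / hourly_rate


# $1000/mo. of boilerplate time reduced to $989/mo., minus the $10 fee
print(net_monthly_benefit(1000, 989))
# at an assumed $60/hour, minutes per month that must be saved to break even
print(break_even_minutes(60))
```

So under these assumptions the tool pays for itself as soon as the time saved is worth more than the $10 fee, which is the point the comment is making.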
Which means it's going to be harder to evaluate junior candidates' code without actually running and testing that code, because they'll have built a huge library that looks really well formatted but has logic gaps which are difficult to catch at a glance. As it stands currently, usually some style and organization tells you this person gets it.
>What data does GitHub Copilot collect?
>...
>User Engagement Data
>...
>Code Snippets Data
>Depending on your preferred telemetry settings, GitHub Copilot may also collect and retain the following, collectively referred to as “code snippets”: source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and files paths.
It's possible to opt out, but it's not disabled by default, and these code snippets might be very sensitive.
At best, it is scary in its ability to pre-emptively suggest context-specific implementations of functions before I have even considered what I might need to do. It probably helps that I am very particular about how I name variables, which seems to help Copilot infer my needs.
But at least a couple of times a day, I am blown away by it.
Will be paying for the public version, absolutely worth the money in a single day's coding alone.
https://github.blog/changelog/2021-10-27-pull-request-merge-...
No bulk licensing for Teams? This makes no sense, so if a team wants to make Copilot part of their official tools, each member has to purchase this individually. That's a huge PITA
That said, launching a dev tool without Orgs integration seems dumb. I work for a FAANG and so can’t use this professionally. It’s a totally different price calculation for “programming as entertainment”. Is this worth more than Netflix to me?
Having a type checker is critical for this, though. When I code Ruby I’m much more skeptical of Copilot's suggestions.
No.
Is it particularly smart?
Also, no.
But it really speeds up all the dumb stuff in coding. UI code especially can be very chatty, and Copilot is a nice assistant here.
Also, it would be cool if it was part of GitHub Pro, which I'm already paying for, haha.
Would be happy to pay for it (or expense it to my employer) if I was still an IC.
HackingWithSwift shows how this process gets rocket wings with Swift[0] (skip to the end for the mind melter)
[0] https://www.hackingwithswift.com/plus/high-performance-apps/...
Copilot is far better.
It understands what I’m trying to do, and does it for me.
Preferably, I would like to try it in Vim. But anything that I can run in a container would be ok.
Tabnine, a similar competitor, explicitly mentions this on their website:
" Tabnine only uses open-source code with permissive licenses for our Public Code trained AI model (MIT, Apache 2.0, BSD-2-Clause, BSD-3-Clause). "
Other commenters here say the completion quality is worse than Copilot. I use Tabnine for local short completions only and am quite happy with it. Didn't try Copilot yet.
Update: It seems like they check whether the code it emits matches the training set and if it does it won't suggest it.
Do people want to see ads instead?
Or would they be ok paying for Google search (another service trained by all the information we willingly volunteer to them)?
Copilot adds tremendous value and they are justified charging for it.
Extension activation failed: "Unexpected end of JSON input"
Copilot saves me from leaving my IDE in a large number of situations. It saves me from opening a new tab (tab #1003) and Googling my problem, finding a solution on StackOverflow, scrolling down to the answers, curating the best answers, picking the one I like, copy/pasting it, then tailoring it to my liking (JS to TS, naming conventions, etc.) and testing it.
I'll miss it for personal stuff but I'm not paying $10 a month just for my personal projects at home.
To paraphrase: "sure it's mind-blowing and the biggest productivity gain in years, but I want it FREE".
Yes. You got used to it being free. And now it's not. But $10/mo is a steal. It's more than fair and far, far less than they could get.
And no. They don't owe you anything.
In fact, they probably host your code (often free), and less directly provide your IDE (for free). So this idea that they owe you something needs to be reassessed.
CoPilot is easily worth it and I think this is fair. I actually welcome it because I was nervous it might be like 80.
TLDR: Tabnine advantages vs Copilot 1. Can run locally 2. As-you-type suggestions (mid-line) 3. Private model based on your code 4. Free plan available
Read more at https://tabnine.com/tabnine-vs-github-copilot
# imports the snippet relies on (not part of the quoted suggestion)
import numba
import numpy as np

# use numba to speed up the accumulation of the moving average
@numba.jit("float64[:](float64[:], float64[:])", nopython=True, nogil=True)
def moving_average(x, a):
    n = len(x)
    y = np.empty(n, dtype=np.float64)
    y[0] = x[0]
    for i in range(1, n):
        y[i] = y[i-1]*a[i-1] + x[i]*a[i]
    return y
I would have found it with a Stack Overflow search, but it gave me this after I just typed: # use numba to …
Probably the best autocomplete I’ve ever used across multiple languages, but it’s not reliable at all for the more complex tasks that their marketing makes it seem it’s good at.
Usually, I suggest that my team start with the user value and experience, but for this specific comparison, it’s essential to start from the technology, as many of the product differences stem from the differences in approach, architecture, and technology choices. Microsoft and OpenAI view AI for software development almost as just another use case for GPT-3, the behemoth language model. Code is text, so they took their language model, fine-tuned it on code, and called the gargantuan 12-billion parameter AI model they got Codex.
Copilot’s architecture is monolithic: “one model to rule them all.” It is also completely centralized - only Microsoft can train the model, and only Microsoft can host the model due to the enormous amount of computing resources required for training and inference.
Tabnine, after comprehensively evaluating models of different sizes, favors individualized language models working in concert. Why? Because code prediction is, in fact, a set of distinct sub-problems which doesn't lend itself to the monolithic model approach. For instance: generating the full code of a function in Python based on name and generating the suffix of a line of code in Rust are two problems Tabnine solves well, but the AI model that best fits every such task is different. We found that a combination of specialized models dramatically increases the precision and length of suggestions for our 1M+ users.
A big advantage of Tabnine’s approach is that it can use the right tool for any code prediction task, and for most purposes, our smaller models give great predictions quickly and efficiently. Better yet, most of our models can be run with inexpensive hardware.
Now that we understand the principal difference between Microsoft’s huge monolith and Tabnine’s multitude of smaller models, we can explore the differences between the products:
First, the kind of code suggestions. Copilot queries the model relatively infrequently and suggests a snippet or a full line of code. Copilot does not suggest code in the middle of the line, as its AI model is not best suited for this purpose. Similarly, Tabnine Pro also suggests full snippets or lines of code, but since Tabnine also uses smaller and highly efficient AI models, it queries the model while typing. As a user, it means the AI flows with you, even when you deviate from the code it originally suggested. The result is that the frequency of use - and the number of code suggestions accepted - is much higher when using Tabnine. An astounding number of users accept more than 100 suggestions daily.
Second, ability to train the model. Copilot uses one universal AI model, which means that every user is getting the same generic assistance based on an “average of GitHub”, regardless of the project they're working on. Tabnine can train a private AI model on the specific code from customers’ GitLab/GitHub/BitBucket repositories and thus adjust the suggestions to the project-specific code and infrastructure. Training on customer code is possible because Tabnine is modular, enabling the creation of private customized copies. Tabnine "democratizes" AI model creation, making it easy for teams to train their own specific AI models, dramatically improving value for their organization.
Third, Code security and privacy. There are a few aspects of this. Users cannot train or run the Copilot model. The single model is always hosted by Microsoft. Every Copilot user is sending their code to Microsoft; not some of the code, and not obfuscated - all of it. With Tabnine, users can choose where to run the model: on the Tabnine cloud, locally on the developer machine, or on a self-hosted server (with Tabnine Enterprise). This is possible because Tabnine has AI models that can run efficiently with moderate hardware requirements. This means that, in contrast to Copilot, developers can use Tabnine inside their firewall without sending any code to the internet. In addition, Tabnine makes a firm and unambiguous commitment that no code the user writes is used to train our model. We don’t send to our servers any information about the code that the user writes and the suggestions they’re receiving or accepting.
Fourth, commercial terms. Microsoft currently offers Copilot only as a commercial product for developers, without a free plan (beyond a free trial) or organizational purchase. Tabnine has a great free plan and charges for premium features such as longer code completions and private models trained on customers’ code. We charge a monthly/annual subscription fee per number of users. All our plans fit organizational requirements.
Philosophically, Copilot is more of a walled garden where Microsoft controls everything. Copilot users are somewhat subjects in Microsoft’s kingdom. Tabnine’s customers can train the AI models, run them, configure the suggestions, and be in control of their AI.
In sum: both products are great; you’re welcome to try Tabnine Pro and see which one you prefer. For professional programmers, Tabnine offers in-flow completions, the ability to adapt the AI to their code, and superior code privacy and security.
For those who want to try Tabnine Pro, here’s a coupon for one month free https://tabnine.com/pricing?promotionCode=TWITTER1MFREE
Also, here's a detailed comparison table of Tabnine vs Copilot https://tabnine.com/tabnine-vs-github-copilot
I wouldn't go that far. It's a pretty big help with repetitive/boilerplate code and it's pretty good at intelligently transforming data, but I've found it gets in the way more often than it helps in every other case.
I would also not go that far.
Having good auto completion because of Typescript for me is the way way way bigger productivity gain.
Isn't that up for us to decide?
For work yeah sure I have no problem.
But I've been using it at work and home and my hobbyist projects are hardly worth paying $10 a month to use it. So in that context it's pricey. That's not "entitlement" that's just the value of the product to me.
That's not how I would paraphrase most of the comments here. At least the ones I'm seeing are closer to: "it's really neat as far as free demos go, but ultimately is not that useful and not worth paying for."
My current prediction is that this coming recession and the increasing cost of money is going to lead directly to a new AI winter. This almost goes without saying for the mountains of useless ML projects being churned out by DS teams in companies big and small. However, even for these very expensive, well-staffed projects, there's still a gap between amazing demo and game-changing product that none of the recent AI projects have been able to close. After billions poured into these demos, in the past 10 years very little of daily life has been impacted by AI, and in 10 more years even less will be, since companies will stop forcing useless AI projects on customers.
As someone with a lot of experience in ML/DS, I would recommend everyone in this field start thinking about how to reimagine your resume for something else. There's going to be a massive contraction in this space once the cheap money stops flowing.
Microsoft is selling AI services based on training data they don't own and didn't acquire rights to, nobody writing the licenses of the code it's using had the opportunity to address this kind of code use without license, attribution, or consent. (and the training data is a huge part of the value of an AI product)
I agree, but it still uses resources and those don't come for free (hardware, electricity, cooling, maintenance staff, housing, etc.)
It's really difficult to assign monetary value to all these aspects and weighing them against each other in a fair manner.
The consent issue is a difficult legal aspect as well. GitHub's ToS Section D.4 clearly states they retain the right to process your content and
> parse it into a search index or otherwise analyze it on our servers
It can be argued that using the content to train an AI model falls under "analyze it on our servers". Also:
> It also does not grant GitHub the right to otherwise distribute or use Your Content outside of our provision of the Service
If Copilot is part of their service, it's within their rights to distribute the content, e.g. by means of Copilot as a processed part of the model.
GPL and other licences don't place restrictions on usage as training data. It's currently a very murky legal grey area. Licences need to adapt to this new form of usage pattern.
Damn, rip Google.
I'm enjoying reading some comments where people consider how much it's actually worth for their usage. Dollars brings some sober analysis. I'm sure the development and compute have a significant cost, and should be paid for.
I have loved using it. I've had several moments where I had to stop typing to look up a formula for something, and a few seconds later it provided the correct formula. Gives me those warm fuzzy feelings emacs used to give me.
/snark! I think it'd be great if AI could tag its sources and distribute money accordingly, but I expect some perverse incentives to pop up in doing so...
Because, if they don't pay these folks... I mean, who does that hurt? The concept of intellectual property exists to incentivize creating valuable art/literature/code. In theory at least, we agree to uphold IP laws because we recognize that more value gets created when there's a state-enforced monopoly for the person who came up with that piece of art/literature/code.
But we also recognize that sometimes these laws go too far; eg that there are patent trolls and corporations fighting public domain and game publishers going after anyone who makes a let's-play of their video.
In those cases, it's reasonable to think the world would be better off if we all shrugged and told the IP holders "too bad, someone else is going to create value off your work and you're not going to get a cent from it, we just think it's not worth building and maintaining a nightmare bureaucracy just so you can tax them".
And from that point of view... Copilot is fine? It's not like the people posting code on GitHub or StackOverflow were thinking "I'm only doing this because I know a future AI 10 years from now won't scrape the code I wrote to train a neural network to create a code completion engine". Yeah, yeah, this breaks the spirit of the GPL and Stallman's vision, etc, etc.
But... I mean, at some point, you got to stop debating semantics and wonder what we're coding for. What Microsoft has created is a tool that can collectively save developers billions of man-hours. It's a net good for humanity. As far as I'm concerned, the fact that this net good was developed is infinitely more important than the fact that Microsoft didn't pay royalties to a nebulous amount of developers who wouldn't have noticed anything if Microsoft hadn't developed Copilot.
tldr MIT license is great, piracy is great, fanfiction is great, screw the very concept of intellectual property.
I use the vim extension for vscode which is great.
In general, I would say learning the tools we already have has, for now, a greater impact on productivity than Copilot.
It was not published to be freely reproduced without adhering to licenses, etc.
You don't need to acquire rights to read a newspaper (other than say, paying a dollar), you do need rights to copy articles and sell them.
once you've written a few lines of code as part of a larger project, is the rest of the world prohibited from writing the same code unless they agree to the terms of your license?
If you want to make a point about things that incidentally match making people who independently reinvent the same thing, you're criticizing the function of software patents, not copyright.
About deploying the model - it just needs to filter out verbatim exact snippets so it only outputs original, unattributable code. That can be done by hashing ngrams and a bloom filter. The vast majority of code generated by Codex is original anyway.
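For what it's worth, the "hash n-grams into a Bloom filter" idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not how Copilot or Codex actually filters suggestions; the class and function names, the whitespace tokenizer, the n-gram length of 6, and the 0.5 threshold are all assumptions made up for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def token_ngrams(code, n=6):
    """Split on whitespace (a crude tokenizer) and return all n-token windows."""
    tokens = code.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(training_snippets, n=6):
    """Record every n-gram of the training code in the filter."""
    index = BloomFilter()
    for snippet in training_snippets:
        for gram in token_ngrams(snippet, n):
            index.add(gram)
    return index


def looks_verbatim(suggestion, index, n=6, threshold=0.5):
    """Flag a suggestion when at least `threshold` of its n-grams were seen."""
    grams = token_ngrams(suggestion, n)
    if not grams:
        return False  # too short to judge
    hits = sum(1 for gram in grams if gram in index)
    return hits / len(grams) >= threshold


training = ["for i in range ( 1 , n ) : y [ i ] = y [ i - 1 ] + x [ i ]"]
index = build_index(training)
print(looks_verbatim(training[0], index))
print(looks_verbatim("def scale ( a , b ) : return a * b + 42 if a else b", index))
```

The trade-off is that a Bloom filter can wrongly flag some original code as seen, but it will not miss an n-gram it has recorded, and it stays small even for a huge training set.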
By the way, Codex is good for many other tasks, like, parsing the fields of a receipt, or extracting the summary of an email, or generating baby names, it's an all purpose NLP tool. Just call it like a function. Code completion is just one thing it does. It talks pretty great English, can compose poems.
That's a setting now.
Copilot isn't honoring the license, so why does it matter whether it was under a restrictive or permissive license?
I don't think it's really that murky, these models contain and have been shown to reproduce copyrighted code with the right prompting, it's not a grey area it's just obfuscated theft.
seems to me anyone agreeing to the ToS should expect their code to show up on other peoples screens as search results
really the question is a matter of degree, is copying your nested for-loop iterating through a row oriented matrix really a unique piece of code protected by copyright? Or does the copyright apply to the file you've written as a whole, leaving room for me to accidentally use words in the same order? clearly there is a tipping point between writing code that looks like yours and using the code you've written outside the terms of your license, we will have to wait for courts to decide where that line is for all ML, not just co-pilot
also copying is not theft
When I reproduce code based on something I looked up, I do indeed have to be careful not to explicitly copy sizable chunks. Some things are obvious and the only way to do things, but not everything.
What users and copyright holders expect from humans does not automatically apply to marginally similar situations with computers and ML applications. For example: if I'm walking down the street I don't mind at all if someone recognizes me or a stranger remembers seeing me later, but I'm actually rather bothered if someone (or the state) is running facial recognition software and recording every time it sees me or anyone else walking down the street.
Copilot is different - it clearly takes a lot of skill and effort to turn a bunch of GitHub repos into a fancy autocomplete system.
How well does Copilot help with languages like Elixir that are less common? With TypeScript it's been remarkable, but that's one of the most popular languages and surely very familiar to devs and GH, so I would expect less popular ones like Elixir to not perform as well.
Does copilot work for shell scripts?
I'm a vim person and don't want to use VS code. Is copilot worth the hassle to get installed into vim?
Copilot did pretty poorly when I tried using it with Julia- it kept suggesting Python code. I suspect it would do something similar in Elixir.
I'm also a vim person who doesn't want to use VS code, but I've gotten more than enough value to get into my first IDE (with vim keybindings). A lot of tedious C++ code is getting correctly auto-generated.
Sometimes, with variable results. I think I've only observed it guess patterns from the current directory
> Does copilot work for shell scripts?
Yes, it gave me this earlier today while editing my .zshrc:
# kill a process on a given port
killport() {
    lsof -i :$1 | awk 'NR!=1 {print $2}' | xargs kill
}
Oh wow, a language where there are 20 ways to do something, three of them are common, but only three others actually behave, by any standard, correctly, while being among the least common in public code, seems like exactly the wrong kind of thing to use this for.
Shell doesn't need machine-learning autocomplete trained on existing shell scripts, it needs a hand-built aggressive linter.
Something like https://www.shellcheck.net/?
My primary usage is shell scripts, as it seems to struggle on complex code, while shell scripts are typically a lot of simple code.
No, it cannot make me write code I couldn't write before. It is not an autopilot and does not do all the coding by itself. But it still boosts my productivity greatly, letting me relax while coding and focus on the important parts rather than errands.
Also if there are some repetitive sections of code I need to bang out quickly this will auto fill that repetitive pattern (although I'd argue this is usually a sign that the code should be cleaned up)
I avoid letting it fill in large swaths of code though. I have no idea where that code is coming from (license infringement?) and it tends to go way off the rails.
Additionally I feel that it makes me a worse programmer if I allow it to take over too much.
I've been programming for 20 years (more if you count my time as a kid) and have a certain flow. Part of that flow is the natural pause between thinking of solutions and typing. When the computer is beating me to the typing portion (and often times making mistakes) I would find myself doing more code review than code writing. Sometimes a few bugs popped up and it was thanks to copilot (or was it me failing to correct copilot's mistakes?).
I found my brain sort of switching into a different mode. Rather than thinking about my next steps I was thinking about the steps the computer just took and how I needed to clean them up.
Rather than the AI being my reviewer during a paired programming session, I was the computer's reviewer.
So now, like I said I use it very sparingly.
I will say that I'm not averse to change and do appreciate the new tools that we have available to us - Starting on a x386 writing QBASIC as a kid to using Jetbrains Rider is an indescribably different experience.
That said, I'm not ready to move to the backseat and let the computer take over yet. In small doses copilot is fine, but I wouldn't lean heavily on it for large projects or to do the thinking for me.
It's called OpenAI Codex. https://openai.com/blog/openai-codex/
It is very useful for things that I would call boilerplate, e.g. you have almost duplicated code (say in a view and a controller) and need to copy from one to the other.
It is annoyingly bad for autocompleting an api as it tends to be slightly (and plausibly) wrong.
I haven't found it very useful for anything else.
Working on a project where I have to do lots of the first makes me sad, so I tend to try to avoid those projects - but if I was forced to for some reason it would be worth $10 a month. However, if enough of the programming I did could be helped by github copilot for it to be worth that much I would start to get worried I was working on the wrong sort of problems and try to move into something different.
I can't use it to generate longer chunks of code like methods or functions, because it will get them a bit wrong and I lose time correcting it.
It can somehow generate correct and fitting code, but it takes multiple tries and writing comments in which you describe exactly, with lots of details what you want to do. At that point I'm better off writing the code myself.
However, if the method should be small like VerifyIfNumberIsEven, it does a good job.
Probably I would pay 10$ for it.
I recently started a 100% Common Lisp job and it does not work nearly as well for Common Lisp. A lot of generated code is Emacs Lisp.
Two months ago I would have signed up for a paid account with no hesitation, but I need to re-evaluate it with Common Lisp again. BTW, I happily pay OpenAI for GPT-3 APIs instead of using it for free. For NLP work, OpenAI's APIs have high value to me.
Definitely does not seem worth paying for me to end up more stressed out, haha.
The IntelliJ Copilot plugin became worthless just before the release. It borks up the formatting and requires almost more keystrokes to make the code work than it saves.
It sometimes works brilliantly, but the result has almost always been either duplicated code which could use refactoring or simple-minded attribute access code which could be solved generically. I fear that it will push developers to go the "easy route" and not think about the code too much while churning out more and more lines of generated code, so I'm unwilling to recommend it to junior developers.
However I wish there was more competition. Github could rescind access to Copilot or charge $40/mo or it could slow down because their cloud is overloaded with new users, and I would be out of luck.
Tabnine and Kite are alternatives but I've heard they don't work nearly as well. I wish there were similarly-effective alternatives which charge similar rates for cloud hosting / profit, but open-source their datasets and algorithms, and just generally provide a fallback if Copilot's quality ever goes down.
Since this is derived from code Microsoft did not write, or ask permission to use, it should be at the very least free to use.
"They gave away all their code, so we packaged it up and sold it right back to them, the stupid bastards!"
It’s also awful that they took free code (open-source), and now they want money for it. Make it open-source and free to use…
Some say it’s great for repetitive tasks, but if you write repetitive code (tests also) maybe you should look for other solutions than “auto-generating” unmaintainable code.
.filter(table.deleted==False)
nothing complicated, but one tends to forget it. So I got into the habit of starting a new line in whatever query I am building and seeing what Copilot thinks I forgot.
That way it can capture code in a library instead of having thousands of developers copy/paste the same code snippets.
Every human was just trained in the same way. Why isn't this a problem for every human?
I really don't see the difference. One is an artificial neural network while the other is a biological neural network?
In my view: I don't believe a machine (at least not any we're capable of creating) can truly learn.
Copilot is a machine working on its inputs. Humans think and create. Maybe it can be argued that humans are just more complicated machines, but I don't think most people would agree with such an equivalency.
Copilot is constructed almost entirely from others' code. There's a tiny fraction of original "ai glue" in there, but the end product is arguably a derivative work of all that code it was trained on. As is its output.
It can also be argued that the AI part is really just an obfuscating copy machine. One that was created specifically for that task.
And of course, the real killing blow: if/when it reproduces training code verbatim, and you don't notice... will "copilot did it" be a valid defense in court? There are different opinions on that I guess, but no one knows for sure -- and I wouldn't take that risk.
Here GitHub (Microsoft) is charging for a product that in certain circumstances violates copyright.
Yes, if Copilot (or a human) would copy existing code, that would be a copyright violation. But none of the arguments here are about that. It's just about the learning.
I think I myself taught Copilot a lot of things about supersymmetry :)
> Do you want to start using GitHub Copilot today? Get started with a 60-day free trial, and check out our pricing plans. It’s free to use for verified students and maintainers of popular open source software.
Seems pretty clear. If you're willing to do your own research (aka going to the CoPilot site): https://github.com/github-copilot/tp_signup, you'll see that pricing reflected here as well as the date when the free period ends, which is August 22nd.
I guess that means I feel the level of expectation is in the name.
To everyone expecting Copilot to magically write the code they are thinking about - you are missing the point. There is a learning curve of using this service that allows you to be more efficient in expressing your ideas. It's not about doing all the work for you. It's like auto-complete on the next level.
Licensing concerns - oh come on.. what is the big deal? There are millions of "for (int i ..)" loops out there. Like anyone gives a damn about 5 auto-generated lines being _probably_ copied from somewhere. Moreover, if you used Copilot just a bit you would know that is not how it works.
“GitHub Copilot is optimized to help you write Python, JavaScript, TypeScript, Ruby, Go, C#, or C++.”
https://docs.github.com/en/copilot/overview-of-github-copilo...
I'm pretty sure I have done that by accident already in the past without noticing. This is not so unusual when you write some code with very common patterns.
And then this even can apply for code which you have not seen before. E.g. write some bubble sort function. Very likely you will find exactly the same code online.
https://www.warp.dev/blog/replace-git-cheat-sheet-ai-command...
:() { :|: } :&
Thanks copil[user disconnected].
Yeah, it makes mistakes; sometimes it shows you, e.g., the most common way to do something, even if that way has a bug in it.
Yes, sometimes it writes a complete blunder.
And yes again, sometimes there are very subtle logical mistakes in the code it proposes.
But overall? It's been *great*! Definitely worth the 10 bucks a month (especially with a developer salary). :insert shut up and take my money gif:
It's excellent for quickly writing slightly repetitive test cases; it's great as an autocomplete on steroids that completes entire lines + fills in all arguments, instead of just a single identifier; it's great for quickly writing nice contextual error messages (especially useful for Go developers and the constant errors.Wrap, Copilot is really good at writing meaningful error messages there); and it's also great for technical documentation, as it's able to autocomplete markdown (and it does it surprisingly well).
Overall, I definitely wouldn't want to go back to writing code without it. It just takes care of most of the mundane and obvious code for you, so you can take care of the interesting bits. It's like having the stereotypical "intern" as an associate built-in to your editor.
And sometimes, fairly rarely, but it happens, it's just surprising how good of a suggestion it can make.
It's also ridiculously flexible. When I start writing graphs in ASCII (cause I'm just quickly writing something down in a scratch file) it'll actually understand what I'm doing and start autocompleting textual nodes in that ASCII graph.
I forget a lot of things, simple dumb stuff like type conversions or specific keywords spelling. Copilot takes care of 99% of that so I can focus on my higher level spec.
If anything, sometimes it's too aggressive. I start typing a word and it's already building up the rest of the application in a different direction...
I've had this experience too. Usually it's meh, but at one point it wrote an ENTIRE function by itself and it was correct. IT WAS CORRECT! And it wasn't some dumb boilerplate initialization either, it was actual logic with some loops. The context awareness with it is off the charts sometimes.
Regardless I find that while it's good for the generic python stuff I do in my free time, for the stuff I'm actually paid for? Basically useless since it's too niche. So not exactly worth the investment for me.
I find I spend my time reviewing Copilot suggestions (which are mostly wrong) rather than thinking about code and actually doing the work.
There's a ton of developer effort that went into Copilot and those devs should be paid fairly. But the majority of what fuels Copilot is the millions of lines of open source code submitted every day.
I think I'd feel a lot better about it if they committed a good chunk of that money back into the same open source communities they depend on. Otherwise it's a parasitic (or at least not fully consensual) relationship.
But! I think the savings are even bigger because there is no context switch. If I have my browser open I might find myself going to hackernews, checking my email, looking at my stackoverflow notifications, browsing twitter - whatever. Copilot is not only faster, it keeps my focus on the code without giving me a chance to get distracted. In some sense, for this example, Copilot saved me 5-10 seconds by not needing to Google something. In another sense, it might have saved me an hour because I didn't decide to just check something on twitter while I had my browser open.
Like, writing a really boring unit test might only take 60 seconds, but if Copilot can do it for me (even if I have to quickly scan it for correctness) that saves me… well something other than 60 seconds. It sure feels like a big deal.
It wasn't much help in designing/implementing classes in Java or .NET, but when it came to implementing unit tests it practically wrote everything after I named the class and designated it as a test. It was able to extract all the different methods from the classes being implemented, and create appropriate unit tests based on that.
Now, it was school homework, so not representative of a complex business application, but if it can just handle the basics/boilerplate, it would be worth it.
Assuming a (European) work week of 37.5 hours per week, $10 comes down to $0.06 per working hour, and if it can just save me 5 minutes of work every day it will be worth it.
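The per-hour figure above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and the conversion of 52/12 weeks per month is an assumption that the comment does not state.

```python
def cost_per_working_hour(monthly_price, hours_per_week, weeks_per_month=52 / 12):
    """Spread a monthly subscription price over the hours worked in a month."""
    hours_per_month = hours_per_week * weeks_per_month
    return monthly_price / hours_per_month


# $10/month over a 37.5-hour work week
hourly = cost_per_working_hour(10, 37.5)
print(f"${hourly:.2f} per working hour")
```

With those inputs the month comes to about 162.5 working hours, which is where the roughly six cents per hour comes from.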
For example, I just had to convert some OCaml code to Rust. I wrote the first few conversions, and then I would just paste the OCaml code in a comment, and it would auto-complete the equivalent code in Rust. I just had to check that it was correct, which it was most of the time, rinse and repeat and wow. One would have to be blind to not find copilot impressive, really, it's the future.
And no, I'm not a beginner. I'm a principal with over 20 years experience. I don't really use it for the Stack Overflow-type stuff, but even as an autocomplete it's worth the money. As it happens I'm apparently eligible for free access as an open source maintainer, but I'd pay $100/year in a heartbeat for it. I'd pay for Intellisense if that was $100 too.
I’m curious whether we’re talking 5% or 30%.
No it doesn't "understand what I'm doing" or "get everything right" but that's hardly the point
It's often reducing the amount of labor I'm doing by hitting the keyboard by guessing 90% correctly what I was going to type
It also often saves me from having to google how to do something, it's effectively serving me a search result right along my code
I'm lucky to be getting it for free but would have immediately paid $10. It needs to only save you minutes a month for that to be worth it
Also the comments about it being "unfair they're monetizing other people's work" are missing the point.
Github has created a product that many people use and through that effort created a large repository of code.
They are now releasing a product that is going to create a large amount of value in time saved and are maybe capturing 2% of that. This is a great outcome for everyone
- It's an amazing all-rounder autocomplete for most boilerplate code. Generally anything that someone who's spent 5 minutes reading the code can do, Copilot can do just as well.
- It's terrible if you let it write too much. The biggest problem I've had is not that it doesn't write correctly, it's that it thinks it knows how and then produces good-looking code at a glance, but with wrong logic.
- Relying on its outside-code knowledge is also generally a recipe for disaster: e.g. I'm building a Riichi Mahjong engine and while it knows all the terms and how to put a sentence together describing the rules, it absolutely doesn't actually understand how "Chii" melds work
- Due to the licensing concerns I did not use CoPilot at all in work projects and I haven't felt like I was missing that much. A friend of mine also said he wouldn't be allowed to use it.
You can treat it as a pair programming session where you're the observer and write an outline while the AI does all the bulk work (but be wary), but at what point does it become such a better experience to justify 10$/mo? I don't understand if I've been using it wrong or what.
> body: `text=${text}`,
So it breaks if the text contains a '&' and even allows parameter injection into the call to the 3rd-party service. That isn't that critical on a sentiment analysis API, but it could result in actual security holes.
I hope the users won't blindly use the generated code without review. These mistakes can be so subtle, nobody even noticed them when they put them on the front page of the product.
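The quoted snippet is JavaScript, but the same mistake is easy to reproduce in Python. The sketch below is an analogue, not the code from the Copilot front page: the function names and the extra `lang` parameter are invented for illustration. It contrasts naive string interpolation with `urllib.parse.urlencode`, which percent-encodes `&` and `=` so that user text cannot add parameters of its own.

```python
from urllib.parse import parse_qs, urlencode


def naive_body(text):
    """Build a form body by string interpolation (the pattern criticized above)."""
    return f"text={text}"


def encoded_body(text):
    """Build the same form body with proper percent-encoding."""
    return urlencode({"text": text})


malicious = "great&lang=fr"

# The naive body is parsed as two parameters: the user text injected `lang`.
print(parse_qs(naive_body(malicious)))

# The encoded body is parsed as a single `text` parameter with the full string.
print(parse_qs(encoded_body(malicious)))
```

The server-side parser here is simulated with `parse_qs`; a real third-party service may parse bodies differently, but any standard form decoder will split the naive version at the `&`.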
1) You should have managed users' expectations better. Tell them from the beginning that it will become a paid feature, so nobody gets surprised. 2) The way everyone found out about this today was too aggressive: an endless warning in Visual Studio saying "hey, I've stopped working, please sign up and pay or uninstall me". Too violent.
A "Hey, we are happy you're using Copilot. We want to inform you that in 2 weeks we will close the beta and we will need you to sign up. But don't worry, it will be free for 60 days"
I'm sure 99% of people here would just be happy to pay those 10usd/month
GitHub Copilot could not connect to server. Extension activation failed: "User not authorized"
So now each individual developer using it for work suddenly has to either pony up $10/month or figure out how to expense it.
And now those devs are going to have to go to their boss and explain all the ways they’ve opened their company up to liability?
This should be hilarious.
Copilot is such a marvel though. I think they could have gotten away with it if they did like you say and give more of an advanced warning.
What is your time worth? You should easily get $60/hr, so you need to save 10 minutes per month to make it worth it. I would pay that for all my employees.
CoPilot is not a replacement for writing code, but it’s incredibly useful when you are stuck and/or writing simple logic.
Often I don’t have the right method, function or logic in mind. Before I google, I write a comment of what I want, and 8 times out of 10 CoPilot generates the right code.
Typing the comment, checking the solution, reformatting it is <<< less time than without it.
To me Github CoPilot is a standard part of my IDE and I wouldn’t want to miss it anymore. It saves me at least an hour a day of coding. Some stuff is really crazy. I invite you all to try to be open-minded. You have to experience it.
// You have to code for yourself
I don’t really like this argument, because if it were true, we would also need to know how our code translates to 1s and 0s and how the electronics run our application. Autocomplete is part of our life on our phones, and it can be part of developing too. Don’t make it harder than it needs to be.
It's incredible that we're able to do these things but awful at the same time since this data was / is not theirs. Same as something like Dall-E.
...and not compensating (or even attributing as required by the licenses) the authors for it.
Copilot learns the "shape" of code. Common patterns and algorithms, etc. You can't copyright an algorithm.
Also feels kind of icky to train on open source projects and then charge for the output.
Then the Internet and Google came around. I found that instead of me maintaining those code snippets, I could search in Excite/Altavista for how to do something, and it will be stored there for me. Later came sites like StackOverflow (expertssexchange before it) which concentrated much of that information which before was scattered in PHPBBs and Geocities pages.
Now I see this Copilot app like the evolution of that; Instead of having to manually go searching for a snippet, I imagine I can "pull it" almost automatically while I am writing code, with an AI helping me search for the right snippet with the current code context.
That doesn't sound bad at all.
Nevertheless, I haven't used it because I DON'T want my code to be sent to Microsoft or any other company. And I don't believe in adding random code for which I don't know the license! What if there is some code which was AGPL that Copilot happens to use? that's pretty bad.
How? I was in beta but looks like I'm kicked out. I also verified my student status but get prompted to pay. Are you a maintainer? Have you verified that you have access?
Copilot is marketed as a pair programmer, but the code it produces is often just wrong, not merely bad. It thinks it understands what I want based on the function name and parameters, but the generated output is nowhere close to what I want.
Multiline AI generated suggestions are not a good idea anyway (not yet at least). AI based LSP/auto completer would be much better at this stage with a lot faster DX.
I've been in the beta since almost the beginning, and I have not really seen much improvement on the frontend side. Since its release, the changelog only mentions 10 small (or so it seems) improvements:
https://marketplace.visualstudio.com/items/GitHub.copilot-ni...
On the backend side, I feel like I've started to "figure out" Copilot a little bit. One thing I'd like to see is inline completion, which I think GPT-3 can do now but Copilot (which I believe is based on it) cannot.
I think I will pay to continue, but I'd like to see some frontend improvements and maybe some backend alternatives. Ideally I'd love this to be open source but compute power doesn't seem feasible (?) unless we start magically crowd sourcing our computers to run a model somehow.
EDIT: looks like I'm getting it for free because of my contributions to open source o.o dope!
I hope, though, that when AI becomes increasingly useful and seamlessly integrated, they're not gonna charge an arm and a leg for it. It's just gonna be way too good to pass up; people won't really have a choice but to pay.
I’m certain Copilot gives me more than a 2% productivity boost. That’s a conservative estimate (I wouldn’t be surprised if it’s more like 10-15%). If you consider 2% of what a developer makes each month, it comes to a lot more than $10. I’m not based in the US, but Levels.fyi suggests it’s not unusual for devs there to make $200K/year, that would mean $16.6K/month, 2% of which is $333. Maybe that’s a bit reductive, but the point is, $10/month is negligible if it gives you a noticeable productivity boost on a developer’s income.
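For anyone who wants to plug in their own numbers, the estimate above reduces to two small formulas. This is a sketch with made-up function names; the $200K salary and 2% boost are the figures from the comment, and treating a productivity boost as directly proportional to salary is the same simplification the comment itself calls "a bit reductive".

```python
def monthly_value_of_boost(annual_salary, boost_fraction):
    """Monthly salary multiplied by the assumed productivity gain."""
    return annual_salary / 12 * boost_fraction


def break_even_boost(monthly_price, annual_salary):
    """Smallest productivity gain (as a fraction) that covers the subscription."""
    return monthly_price / (annual_salary / 12)


print(f"${monthly_value_of_boost(200_000, 0.02):.0f} per month at a 2% boost")
print(f"{break_even_boost(10, 200_000):.2%} boost needed to cover $10/month")
```

Under these assumptions the break-even point is well below one percent, which is the substance of the argument.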
And by the way, I don’t particularly love using Copilot. It can be annoying now I’m over the honeymoon period. But I think it’s pretty clear it speeds me up by a noticeable margin, and time is money.
I have also taught classes and provided mentoring and support to new people up and coming in both programming and infosec. I would argue that as an open source maintainer I am actively contributing back and compensating those other developers. Unlike Github Copilot I am not selling the things I was taught, I am freely making it available to others.
It feels very icky that Github now gets to sell what it learned from my code base, when it has already been shown to replicate code with a 100% match, versus learning how to build on top of ideas or finding novel solutions to problems.
It's nothing I couldn't do myself, but just makes my job that much easier and quicker
I am sure some of the typical cynicism here will turn this into a protracted argument of "well maybe you shouldn't be a shit developer and you would be able to fit all the complexity in your head" but whatever.
Unfortunately, there is no way I can hope for that much…
So yes, please, take my money and do All my boilerplate lol
When they keep giving out freebies (VSCode, npm etc.), I never know which direction the product is going to evolve (e.g. unnecessarily tight integration with Azure).
With this, there's at least direct alignment between end user & the product.
HN can set itself apart from Twitter and Reddit by celebrating great achievements rather than tearing them down.
Copilot stands on the shoulders of open source, yes. So do many of our personal and commercial projects. Copilot benefitted from having beta users. That relationship went both ways.
A big thanks to the Copilot team for letting us be a part of the beta. I will happily pay $10/m for this.
agree !!! as any burglar-thief-attorney will tell you, it is *totally worth it*
Given the cost of a single GPT-3 Codex query, it's very likely that Microsoft/GitHub is still taking a huge operating loss at 10 USD per month.
It's definitely not perfect, but it's worth the price to me and if I can pay and help the product improve, it's a no-brainer.
improved speed and reduced cognitive burden
Just these two justify the $10/month, despite the ten dozen other drawbacks of Copilot.
It also saves a ton of time having to look up small pieces of syntax, I've taken to writing a lot of quick one-off scripts because copilot does a fairly decent job of generating code for the relatively simple individual steps.
The OpenAI threads are the exact opposite: They do not seem organic at all. Of course users probably do all the flagging, but it still gives a bad impression.
It's true that we moderate HN less, not more, when YC or a YC-funded startup is the topic, but (a) Github isn't one, and (b) we can't do any sort of moderation (less or more) on posts we don't see. I didn't see yours until now.
Re OpenAI threads - I'm not aware of anything non-organic going on there. As far as I can tell, HN users are just really interested in AI related stuff. Same for Deepmind threads, etc.
(Btw, although it's common for commenters who break the site guidelines to confer honorifics like "wrongthink" on their own posts, you don't need to resort to that to understand why users flagged your comment.)
The more experience I get with GPT-3 type technologies, the more I would never let them near my code. It wasn't an intent of the technology per se, but it has proved to be very good at producing superficially appealing output that can stand up not only to a quick scan, but to a moderately deep reading, but still falls apart on a more careful reading. At least when that's in my prose it isn't cheerfully and plausibly charging the wrong customer or cheerfully and plausibly dereferencing a null pointer.
Or to put it another way, it's an uncanny valley type effect. All props and kudos to the technologists who developed it, it's a legitimate step forward in technology, but at the same time it's almost the most dangerous possible iteration of it, where it's good enough to fool a human functioning at anything other than the highest level of attentiveness but not good enough to be correct all the time. See also, the dangers of almost self-driving cars; either be self-driving or don't but don't expect halfway in between to work well.
I can’t imagine how Copilot would save anything but a negligible amount of effort for someone who is actually thinking about what they’re writing.
A humorous example: https://cookingflavr.com/should-you-feed-orioles-all-summer/
Human pair programmers will signal when they're not sure about something. A code generator will not.
I once had it generate an entire interview with an author, which was so realistic I was sure it had encountered it verbatim in the training data. The interview was about one of his books. Turns out such a book didn't even exist, but GPT-3 knew real facts like the name of his publisher, the names of employees there etc. and wove them into the story.
The best use I've found for GPT-3 is text summarization, it seems to do very well on that front. I think OpenAI are working on a hyperlinked interface that lets you jump to the original source for each fact in the summary.
random words ---> markov models ---> transformer ---> human writer
Inevitably, users of these kinds of models want them to produce more and more specific output to the point that they really don't want what the models produce and instead are just trying to get a computer to write stuff for them that they want. Eventually all the tuning and filtering and whatnot turns into more work than just producing the output the user wants in the first place.
It's just a room of monkeys banging on typewriters at the end of the day.
Huh, that’s my experience with human-written texts and journalism in particular.
And I also write tests, which should catch bad logic.
I dismissed these concerns before I had early access.
Then, _literally the first characters I typed_ after enabling the extension were `//`, and it autosuggested:
// Copyright 2018 Google LLC
I immediately uninstalled it.
https://mobile.twitter.com/mrm/status/1410658969803051012/ph...
> I am honestly flabbergasted they think it's worth 10$/mo
These two statements seem contradictory to me. Why are you using it 'non-stop' if it isn't even worth $10/month?
> The biggest problem I've had is not that it doesn't write correctly, it's that it thinks it knows how and then produces good-looking code at a glance, but with wrong logic.
I cannot rightly apprehend the kind of confusion of ideas that would provoke such a statement.
EDIT: upon careful rereading, I think I misunderstood. The intended meaning is likely closer to: the problem is less that Copilot produces incorrect code and more that its incorrect code appears correct at first glance.
You have my sincerest apologies. I leave this thread intact as a testament to my hair-trigger snark.
I wouldn't use it for anything other than that, so I would say it's worth honestly at max $1/month.
Most of the things it does for me I could replace with a library of snippets if I could be bothered to set one up.
Not really worth a monthly cost equivalent to, say, Disney+ - which I use tens of hours every month just by myself.
If my employer paid for it, I wouldn't scoff at it, but I'm not paying a cent of my own money for it.
I would consider it contradictory if they decided to continue using it, paying that price while unsatisfied
It's a trial run and the value isn't there for them
The completions are often trivial, but they save me from typing them by hand. Sometimes they are trivial yet still wrong so I need to make corrections, wasting some of the gained speed. In total these probably won't save me much time in a day.
However, every couple of days there is one of these cases, where it can do tedious work that really saves time and headaches.
Example:
- After writing a Mapper that converts objects of type A to B, I needed the reverse. Co-Pilot generated it almost perfectly in an instant. This can easily save a minute or two, plus the thinking required.
- For a scraper, I needed to add cookies from my browser into the request object. Basically, I pasted the cookie in a string, and typed `// add cookies`, and it generated the code to split the string, iterate over each cookie value and add it to the correct request field.
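The cookie case is the kind of small, tedious routine being described. The comment does not show the generated code or name its language, so the following is only a guess at the shape of it, written in Python with the standard library. The function names and the sample cookie string are hypothetical.

```python
from urllib.request import Request


def parse_cookie_string(raw):
    """Split a browser-style 'a=1; b=2' cookie string into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        # partition() splits on the first '=' only, so values may contain '='.
        name, sep, value = pair.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed fragments
        cookies[name] = value
    return cookies


def build_request(url, raw_cookies):
    """Attach the parsed cookies to a request object via the Cookie header."""
    cookies = parse_cookie_string(raw_cookies)
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"Cookie": header})


req = build_request("https://example.com/", "session=abc123; theme=dark")
print(req.get_header("Cookie"))
```

Constructing the `Request` does not send anything over the network; it only prepares the headers.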
So if a few of these cases can save 10 minutes in a month, I feel it's objectively worth it. Then subjectively, not having the headaches of 'dumb stuff'/boilerplate feels great, and I am glad to spend my energy on the actual hard stuff. I will sign up as soon as their sign up page lets me.
The state of the sector is somewhat embarrassing. We have armies of monkeys well-paid to bang out the same Java/Javascript/C#/Python over, and over, and over...
I came to the same conclusion as you, you can see comments I made elsewhere in this thread. I'm not thrilled with it.
I've tried it with a few projects with different languages and it's not worth anything close to that $10/m fee personally.
It's OK at filling in a line here and there if it's boilerplate-type code but otherwise, it's like a beginner programmer at best.
If it were $60 yearly it'd be an auto-yes for me.
The licensing problems make it impossible to use at work so I won’t use it for that.
People need to be aware of the security risks of letting Microsoft read all your code as it’s sent to the servers Copilot runs on. By my lights that’s almost as big of a problem as licensing.
I've rarely found that CoPilot produces more than a line or two of accurate code. How likely is it that one would run into licensing issues with a single line of code that looks similar to something from another codebase?
While I understand the problem in principle, I am really skeptical that significant licensing issues would really come up with using CoPilot as an individual.
Let's say the average developer in the US costs 10k a month (I think that's pretty close to the real average of around 120k a year). So copilot would cost .1% of that developer's salary. I realize calculating things around "improvements in developer productivity" involve lots of fuzzy math, but it would be stupid for any company NOT to pay this if it improves developer productivity by just 1%.
Another way to think about it that I think may be more "real world": Let's say I'm CTO of a big company with 1000 software developers. Do I think it's going to be a better investment to hire another developer so I have 1001 developers, or instead use that other developer's salary to buy all the devs at my company a Copilot license?
But for some reason individual developers think that anything over $1-2 a month is an exorbitant cost.
I would not spend $10/month on a code completer for my job, because I would probably never see those $10 back in salary. I doubt the company would even notice the minor bump in productivity, say $100 a month.
So the same problem ML has in every endeavor where we have a good metric of "correctness" that's distinct from plausibility, like OCR or natural language translation: very good at spitting out stuff that superficially resembles training data, and whether that happens to be right is totally accidental. Surprisingly good odds if you're working on something boring within the "bounds" of the model, sure, but also pretty likely to think that "now on sale" is a zillion times more likely to be announced on an advert than "a decision has been made to release this product (at an unspecified future date)."
it thinks it knows how and then produces good-looking code at a glance, but with wrong logic.
This is so accurate. I still like Copilot and I might even pay for it, but I will never trust the logic. It's always wrong in a way that _almost_ looks right.
I’m honestly flabbergasted that anybody would think it isn’t worth $10 a month, despite its many serious flaws.
I think it is good for short lines and repetitive tasks; for example, when writing tests and asserting different fields (strings, ints, etc.), it was really good and fast for these sorts of lines.
My main problems: 1. it sometimes makes a horrible mistake that takes a couple of minutes to understand 2. it repeats the same mistake over and over 3. adding a single tab takes a bit of time; I had to copy & paste a tab to avoid the Copilot suggestion!
It's an AI powered autocomplete. And honestly it's excellent at that. All I really want is an AI powered autocomplete and if a FLOSS project took up the challenge I'd happily donate $10/month to see it succeed. Especially if it meant none of the licensing concerns that come with GH Copilot
I can't justify $10 a month for it. Maybe as it improves.
EDIT: To clarify, $10 a month for personal use. We can't use it at work due to licensing, or it'd be worth that just to emit boilerplate.
If you're an engineer who is paid $150/hour and Copilot saves you 5 minutes/month it just paid for itself.
If you steal 10 lines of code from me, the damages will be the greater of:
- The benefit to you (10 minutes programmer time)
- The cost to me ($0)
- Statutory damages (probably $200)
In other words, it's very unlikely to be worth a lawsuit. The most likely outcome is:
- A legal letter is sent
- Infringing code is removed
- As good bedside manner, some nominal amount of money is transferred, mostly in some gesture designed to make the violated party feel good about themselves (e.g. a nice gift).
https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_...
For this content:
a nine-line rangeCheck function, several test files, the structure, sequence and organization (SSO) of the Java (API), and the API documentation.
The cost was: "statutory damages up to a maximum of US$150,000".
That's an incomplete view. You're judging the value by the time it'd take to rewrite it.
The real value is in knowing what to type and why.
When Co-pilot suggests GPL code to you, its main value is the knowledge, not the typing.
That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.
Depending on the context, this knowledge would be worth millions.
Worth a lawsuit.
It adds up.
But you know what? I think we'll find that CoPilot will have magically skipped those Oracle repositories and only used code from lowly open source slaves.
I know what you mean, but silly nit pick since you mentioned “commercial” twice - GPL v3 does not prevent commercial use, it only requires copies to be open source. For someone to notice the project has copied code and not be inside the company, the code would (probably) have to be open source. So, this hypothetical is less likely to happen than your comment makes it sound.
A little further off topic, but amusing to me, is that the US government defines “commercial” software to be any software that has a license other than public domain. Free and open source software, such as GPL v3, is still “commercial” because it is licensed to the public https://www.acquisition.gov/far/part-2#FAR_2_101
More on-topic now, a small single function accidentally copied from an open source project by automated software might be considered fair use by US copyright law. https://www.copyright.gov/fair-use/more-info.html
(Edit) Oh yeah, and I just remembered that GitHub’s Terms already carve a necessary exception to whatever license you use in your project, to allow Github to host & display your code. I assume those terms already include some CoPilot coverage…? If not, and if they aren’t legally covered already (which I bet they are), then they could change the terms to stipulate that hosting code on GitHub bars people from suing over incidental amounts of automated copying. Main point here being that the GPLv3 license on your project is neither the only nor the primary license governing GitHub’s relationship with your code.
There are plenty of source-available or open-core projects where use of GPL-ed code would be both visible and incompatible with the licensing.
> Oh yeah, and I just remembered that GitHub’s Terms already carve a necessary exception to whatever license you use in your project, to allow Github to host & display your code. I assume those terms already include some CoPilot coverage…?
Letting github host and display the code is compatible with open source licenses but is very much different from letting third parties incorporate that code into non-open codebases.
> If not, and if they aren’t legally covered already (which I bet they are), then they could change the terms to stipulate that hosting code on GitHub bars people from suing over incidental amounts of automated copying. Main point here being that the GPLv3 license on your project is neither the only nor the primary license governing GitHub’s relationship with your code.
GitHub TOS can only possibly give them authorization from those directly uploading code to GitHub. They don't give GitHub any additional license for code that was uploaded by someone else (for example a mirror bot) because that someone does not have the rights to give out that license.
And even if they write in their TOS that they can do whatever they want it does not mean that they can actually legally do whatever they want - even moreso when retroactively changing those terms.
Honest question: can a function have a license? Ie. can a function be copyrighted?
If an MIT library and a GPL library use the same function with some minor variation and I use the function from the GPL code in my commercial project, have I infringed on someone’s copyright/license? Or would the argument be that the function in question is not copyrightable as almost the same version exists licensed under MIT?
def print_harry_potter_book_1():
    print("Chapter One")
    print("The boy who lived")
    print("Mr. and Mrs. Dursley, of number four, Privet Drive, were")
    print("proud to say that they were perfectly normal, thank")
    print("you very much. they were the last people you’d expect to be in-")
    print("volved in anything strange or mysterious, because they just didn’t")
    print("hold with such nonsense.")
    print("Mr. Dursley was the director of a firm called Grunnings, which")
    print("made drills. He was a big, beefy man with hardly any neck, al-")
    print("though he did have a very large mustache. Mrs. Dursley was thin")
    print("and blonde and had nearly twice the usual amount of neck, which")
    print("came in very useful as she spent so much of her time craning over")
    print("garden fences, spying on the neighbors. the Dursleys had a small")
    print("son called Dudley and in their opinion there was no finer boy")
    print("anywhere.")
    [...]
- Yes, functions are copyrighted.
- Copyrights are not patents, and provenance matters. If you and I independently both come up with the same text, we both have a right to use it. If it could be proven in a court of law that by a 10^(-100,000) chance, we both wrote an identical novel, we'd both have copyrights to our work.
- Conversely, if you took my creative work, the fact that someone else came up with the same creative work isn't a defence
- If the code you borrowed went MIT->GPL->your code, things get very ambiguous. The copyright holder is the original author if the GPL code made no changes.
- For just one function, you might be able to get away with a fair use defence. There's a four-prong test, which is pretty fuzzy. You'd do well on some prongs ("the amount and substantiality of the portion used in relation to the copyrighted work as a whole") and poorly on others ("the purpose and character of the use").
- For something like copilot inserting it, you do much better, since intent matters too.
More like, everybody has their own sourdough starter. Some people sell theirs, some people give it out for free. Someone goes from house to house to make a super-starter by collecting pieces from each house. Then someone uses that starter to sell a patented bread
Ok it's not a great analogy either. Maybe we need to stop trying to reduce the complexities of digital so much
But being digital, the knife was just copied and the original owner still has one, but you also have theirs without their permission. So there's much less sympathy for the victim.
I don't want my code editor to try to up-sell me, ever.
> or assure the developer, that the inserted/suggested code is not a verbatim copy of existing code
No, it does not do that.
> How can developers be sure, that they are not violating licenses by using Copilot
There are no clear answers.
""" We built a filter to help detect and suppress the rare instances where a GitHub Copilot suggestion contains code that matches public code on GitHub. You have the choice to turn that filter on or off during setup. With the filter on, GitHub Copilot checks code suggestions with its surrounding code for matches or near matches (ignoring whitespace) against public code on GitHub of about 150 characters. If there is a match, the suggestion will not be shown to you. We plan on continuing to evolve this approach and welcome feedback and comment. """
That's a bold statement considering how easy it was for testers to quickly find examples of this in initial testing.
> against public code on GitHub
... and how some of those examples found were from code not hosted on Github.
Ultimately though, what matters here is not whether this is true but whether it's plausible enough for legal departments in companies to buy it.
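For anyone wondering what the filter in the quoted FAQ text might amount to mechanically, here is a minimal sketch. It is not GitHub's implementation: the function names, the in-memory set, and the exact 150-character window are assumptions, and it only catches exact matches after whitespace is stripped, not the "near matches" the FAQ mentions.

```python
import re

WINDOW = 150  # approximate match length mentioned in the quoted FAQ text


def normalize(code: str) -> str:
    """Drop all whitespace so formatting differences do not hide a match."""
    return re.sub(r"\s+", "", code)


def build_index(public_snippets, window=WINDOW):
    """Collect every whitespace-free window of `window` characters (hypothetical index)."""
    index = set()
    for snippet in public_snippets:
        text = normalize(snippet)
        for start in range(len(text) - window + 1):
            index.add(text[start:start + window])
    return index


def should_suppress(suggestion, context, index, window=WINDOW):
    """Return True if the surrounding code plus the suggestion contains an indexed window."""
    text = normalize(context + suggestion)
    return any(
        text[start:start + window] in index
        for start in range(len(text) - window + 1)
    )
```

Note that a check like this can only know about code that made it into the index, which is the point raised above about examples that were not hosted on GitHub.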
So to avoid a violation a developer needs to perform a mind-wipe?
If I am writing a novel and I copy a section verbatim from another novel, I am infringing on the other novelist's copyright, regardless of whether I wrote it from memory or not.
And this makes sense. For a trivial operation, there might be only one way to write the code. That's not copyright infringement, just like you're not infringing on an author's copyright by occasionally writing a sentence that was similar to theirs. For a nontrivial operation, you can easily write your own code without copying someone else's work.
Remember also that you can use others' ideas. Copyright only cares about the code itself. If there's a clever trick that you've seen someone use, you're free to use the same clever trick as long as 1) they didn't patent it and 2) you're not actually copying their code
An example for wine/proton/reactos developers from a moderator on the forum about the leaked windows xp code:
"You look at the code? You worked for MS? No dev for us! It's that easy."
https://reactos.org/forum/viewtopic.php?t=20189
There are many instances of large lawsuits where just seeing the old code made you ineligible to even touch the new code
If you draw Mickey Mouse from memory, Disney still owns the copyright.
Maybe I'm using it wrong but I've hardly seen it pump out a mass volume of code.
Copyright violations in the outputted code are a genuine concern; GitHub themselves have admitted it may, rarely, emit raw training data.
That’s not a knock on Copilot, I think it’s a great product and I happily subscribed today after using it the last few months!
For me the bottleneck is seldom typing. And while Copilot can sometimes dish out some more advanced stuff, I still have to verify it and understand it. Since I can basically solve every problem I encounter day-to-day, Copilot's contribution is not that useful.
Does it save you $10 worth of your time within a month?
Comments here are wildly uninformed. I see comments complaining about copyright that seem to have no awareness of either the fair use doctrine and prior law as it relates to partial usage, or the details of how infrequently Copilot generates identifiable verbatim results outside attempts to auto-fill empty files in empty projects (which seems outside typical usage).
Or complaints that it makes mistakes, as if 90% of those mistakes aren't immediately flagged by the linter. Not only that, but I've found that often when it does make mistakes, it reflects a consistency smell in my own code, such as tripping up on a legacy naming convention that should really be refactored out.
If it doesn't save you $10 worth of time, obviously don't use it. Personally I was worried it was going to be more given the ways in which it cuts down on the most boring parts of a high value profession.
But insinuating that someone's positive experience of the tool reflects inexperience is a weird gatekeeper flex, and honestly I'm more inclined to think that all the curmudgeonly resistance I see in here to the inevitable march of progress instead reflects old dogs unable to adequately learn new tricks (like how to effectively prompt it).
This is one of my pet peeves in this field. People create whole programming languages that are "expressive", just to save typing a few dozen characters and have huge tirades against "verbose" languages that require typing a bunch of boilerplate.
If typing the code is the bit that takes the longest for you in a project, stop and take a good look in the mirror. There's something else wrong in the process.
To be specific, the FAQ states: "It has been trained on natural language text and source code from publicly available sources, including code in public repositories on GitHub."
Some have raised concerns that Copilot violates at least the spirit of many open source licenses, laundering otherwise unusable code by sprinkling magic AI dust... most likely leaving the Copilot user responsible for copyright infringement.
AI is just recomposition of existing snippets of code, art, text, music, etc. Does an AI fall under fair use? What happens when an AI produces something too similar to an existing work or trademark. I know the computer won't get sued, the owner/user will. But still, it's a hard problem.
Even if Copilot was initialized with snippets from Open Source Software (exclusively), it doesn't mean that copyright infringement isn't a concern.
When I'm in the flow, trying to solve some algorithmic problem, I always turn it off because the BS suggestions coming from its little "mind" actually slow me down and mess with my focus. Which all makes sense when you realize what it ultimately is - a philosopher, as opposed to a mathematician.
Most projects are 90% BS glue code and 10% actually interesting code. I don't mind only having help with the 90%.
Yeah, this feels like the same nonsense that scientific journal publishers pull. If your product only has value because of what we made, it's completely unfair to not pay us for our work and then to turn around and charge us to use the output.
https://www.infoworld.com/article/3627319/github-copilot-is-...
Sure it has: Time.
In terms of economics it's really simple: Does Copilot free up more than $10 worth of your time per month? If the product works at all as I understand it (I haven't tried), the answer should be a resounding "yes, and then some" for pretty much any SE, given the current market rates. If the answer is no (for example because it produces too many bad suggestions which break your flow), the product simply doesn't work.
There might be other reasons for you not to use it. Ego could be one. Your call.
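The break-even arithmetic behind that question can be sketched as follows. Every number is a made-up input (hourly rate, minutes saved, minutes lost to reviewing bad suggestions, 20 working days); only the $10 price comes from the thread.

```python
def break_even_minutes_per_month(hourly_rate, price=10.0):
    """Minutes of net time the tool must save per month to pay for itself."""
    return price * 60 / hourly_rate


def net_monthly_value(hourly_rate, minutes_saved_per_day, minutes_lost_per_day,
                      working_days=20, price=10.0):
    """Dollar value of net time saved per month, minus the subscription price."""
    net_hours = (minutes_saved_per_day - minutes_lost_per_day) * working_days / 60
    return net_hours * hourly_rate - price


# Hypothetical developer at $50/hour: the tool pays for itself after 12 net minutes a month.
print(break_even_minutes_per_month(50))
# Saving 5 minutes a day but losing 2 to reviewing suggestions still comes out ahead...
print(net_monthly_value(50, minutes_saved_per_day=5, minutes_lost_per_day=2))
# ...while losing more time than you save makes the result negative at any price.
print(net_monthly_value(50, minutes_saved_per_day=2, minutes_lost_per_day=5))
```

The sign of the result depends entirely on the minutes-lost term, which is where people in this thread disagree.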
> Also feels kind of icky to train on open source projects and then charge for the output.
I don't know why it would feel any more icky than making money off of open source in other ways.
It's quite nice not to have to type generic boilerplate in sometimes I guess but it's very frustrating when it generates junk.
For me, this entirely comes down to the philosophy of how a deep learning model should be described. On the one hand, the training and usage could be thought of as separate steps. Copyrighted material goes into training the model, and when used it creates text from a prompt. This is akin to a human hearing many examples of jazz, then composing their own song, where the new composition is independent of the previous works. On the other hand, the training and usage could be thought of as a single step that happens to have caching for performance. Copyrighted material and a prompt both exist as inputs, and the output derives from both. This is akin to a photocopier, with some distortion applied.
The key question is whether the outputs of Copilot are derivative works of the training data, which as far as I know is entirely up in the air and has no court precedent in either direction. I'd lean toward them being derivative works, because the model can output verbatim copies of the training data. (E.g., outputting the exact code with identical comments to Quake's inverse sqrt function, prior to having that output be patched out.)
Getting back to the use of open source, if the output of Copilot derives from its training data in a legal sense, then any use of Copilot to produce non-open-source code is a violation of every open-source licensed work in its training data.
And my problem is : Time.
Cycling through false positives and trying to figure out if it's right costs me way more than $10 a month in productivity.
I can't wait for better versions to come out, but right now, no.
$100/year is a steal for the amount of tedious code copilot helps me with on a daily basis.
What's a pity, however, is that there's no free tier for hobbyists, as paying a 10 USD monthly subscription won't make sense when you only code occasionally. For professionals using it every day, 10 USD / month is inconsequential.
I don't think it would have cost them much more to offer a free allowance covering, say, an average coding session of 8 hours per month.
"open source is great, except when it's used in a way I don't like"
This is particularly the case when we see the emergence of new technologies that use it in different ways. Different people may have a wide variety of equally valid views about how it is incorporated into that system.
There's nothing inconsistent, confusing, or complex about those views.
You wouldn't have an issue with someone making money by using open source software (like a website that is hosted on a server running linux).
It's been really nice for autofilling console logs and boilerplate code...but $10? It's a novelty that is nice when it works, but that's a steep price point for what it is, and I don't see that changing any time soon.
The business model for most of the Internet is to bait people into using things for free and then monetize them without compensation in some roundabout way.
How would you feel if they just provided the software without the model, assuming you could train it yourself on open-source code in an instant?
I just don't like the idea of taking people's work (without asking or checking licenses) and then selling it back to them. It'd be like if Stack Overflow decided to start charging to see answers and not asking or giving a split to the person who gave the answer. I realize they aren't just copy/pasting so not a perfect parallel, but still.
I started a project that currently has 9.4k stars (now mostly maintained by someone else), and still maintain a project that has 2.5k stars.
[1]: https://github.com/andrewmcwattersandco/github-statistics
[2]: https://docs.google.com/spreadsheets/d/1HBSwxr0jkUoMulQxyVTC...
https://github.com/pricing#i-work-on-open-source-projects-ca...
I like how "open source project" == "on github". Can't say that I am surprised though.
If anyone knows why pls let me know
The FAQ [0] says
> A maintainer of a popular open source project is defined as someone who has write or admin access to one or more of the *most popular open source projects* on GitHub
(emphasis added)
[0] https://github.com/pricing#i-work-on-open-source-projects-ca...
I find that if my brain is willing to be distracted, it’s made some sort of calculation saying that the cost of being distracted isn’t going to have a significant bearing on deliverables…and I’m pretty sure I’m no better or worse at estimations than anyone else.
If you don't code much, and Copilot only helps with 5% of that small amount of code, that makes it sound not at all useful.
It falls apart when writing actual code that exists in an app. I’m not convinced even the lowest junior dev could get away with not knowing programming.
> That corresponds to one recitation event every 10 user weeks
> This investigation demonstrates that GitHub Copilot can quote a body of code verbatim, yet it rarely does so, and when it does, it mostly quotes code that everybody quotes, typically at the beginning of a file, as if to break the ice
A year old post now, YMMV.
The example I can remember was Carmack's* quick square root - but I'd probably call that "folk code" given it was passed down/altered before being misattributed to the Quake dev, and appears in hundreds of Github repos (many with permissive licenses like WTFPL, so a well-intentioned human may do the same).
It's not random recomposition, which is worthless. It's useful recomposition, adapted to the request and context. It adds something of its own to the mix.
In addition, the idea of "derived work" in code snippets is, quite frankly, nuts. There are only so many ways to write (let's be generous on the scope of copilot) 25 lines of code to do a very specific thing in a specific language. If you have 1,000,000 different coders do the job (which we do) you'll have a significant amount of overlap in the resulting code. Nobody is losing sleep over potential license issues with this. Because that would be insane.
I have noticed that upholding OSS licensing (at least morally) is kind of a table manner on HN. That's fine, but this is some new level of silly.
It's also not gonna persist, because no matter how much we love our oss white-knightedness, we love having well paying jobs more.
It helps solve the boring simple shit so I can focus on the interesting bit.
Yea, that makes sense, I agree with that. If your use case is skewed more towards "BS glue code" as you say, you'll find more use out of Copilot. Then $10/month can be fair, cheap even.
When you disable auto complete, it can still be fired via a keyboard shortcut, which is how I use it.
https://github.com/github/copilot-docs/blob/main/docs/jetbra...
"editor.inlineSuggest.enabled": false
https://stackoverflow.com/a/71224912/1048433
Agree completely. It’s still a fact that GitHub’s Terms provide a separate license for Github. See section D.4 https://docs.github.com/en/site-policy/github-terms/github-t...
> GitHub TOS can only possibly give them authorization from those directly uploading code to GitHub. They don’t give GitHub any additional license for code that was uploaded by someone else (for example a mirror bot) because that someone does not have the rights to give out that license.
GitHub’s terms already require that uploaders have copyright authorization, or that machine uploaders are doing automated tasks exclusively. Letting a machine upload someone else’s new and copyrighted code that the account owner doesn’t hold the copyright to appears to violate GitHub’s terms. “the owner of the Account is ultimately responsible for the machine's actions” https://docs.github.com/en/site-policy/github-terms/github-t...
> even if they write in the TOS that they can do whatever they want it does not mean that they can actually legally do whatever they want
Yes, correct, I completely agree. Asking for people to agree to ‘indemnify and hold harmless’ over the site’s features is standard language and not really a stretch, it doesn’t amount to Github doing whatever they want. That language is already in the terms.
What is the summary of your comment, what are you trying to say at a high level? We can debate the fine points, but if you are trying to say that my speculative suggestion was crappy and Github has other ways to make copilot legal, then I agree. If you’re trying to say that Github has no legal way to make copilot available, then I disagree.
There is already language in the terms that might cover copilot. Section D.4 I linked to above includes this:
“This license [that you grant us] includes the right to do things like copy it [your content] to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; share it with other users; and perform it, in case Your Content is something like music or video.“
The Copilot FAQ also mentions they have IP filters and actively prevent reciting large portions of anyone’s code, it explicitly mentions a threshold of 150 characters.
Hosting code "for free" is part of its business model. It's not a way of "giving back"
Github Advanced Security costs me $200+/month, but if I make my repository public, I get all of that for free.
Certainly feels like a way of giving back.
Of course it benefits them too, but it doesn’t have to be purely altruistic to be a net positive.
They could even change github so it benefits their code crawling.
Boring and repetitive but of infinite value if you can detect early that your deployment broke something.
If a boring/boilerplate unit test like this can deliver value, other b/b unit tests are probably going to have similar impacts and hence, saying they all “decrease” value is reductionist.
Yeah, and it should come almost automatically from the tooling. If you are writing this, you have a tooling problem.
No. They're not. They're advertising that they are.
They are providing it to a very small set of high-profile OSS maintainers some opaque algorithm picked out. Having high-profile adopters is just good business.
One fundamental aspect of being open source is not limiting the purposes of use. If we now say that "code-generation AI training" is not allowed without prior approval (in addition to the license itself), then it's not open source anymore...
I'm not defending Microsoft's market tactics, for obvious reasons, but we do have to consider that anyone can publish whatever insignificant code as OSS and become an "OSS maintainer" out of nothing.
They have to draw the line somewhere. Nowhere they draw will make everyone happy.
By "better", I mean more absurd, shocking and funny :)
However it's even more expensive than copilot...
But slowly enough many jobs are being automated, both with and without machine learning or whatever technique they are calling "AI" today.
If liability sits somewhere, it's with copilot, github, and Microsoft.
A lot of that might come down to bedside manner. Right now, github isn't super-polite to people whose code it used. That's probably a mistake. They'd be an unsympathetic evil megacorp in a jury trial.
Been a hell of a decade, hasn't it.
The wisdom of crowds works best when:
1. participants are independent (otherwise you may get failure modes, such as "groupthink" or "information cascades")
2. participants are informed, but in different ways, with different opinions;
3. there is a clear, accepted aggregation mechanism, where individual errors "cancel out" to some degree
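A toy simulation of conditions 1 and 3, as a sketch only: it assumes each participant's estimate is the truth plus independent Gaussian noise and that the aggregation mechanism is a plain arithmetic mean. Neither assumption comes from the book; they are just the simplest case where errors can cancel.

```python
import random
import statistics


def crowd_vs_individuals(truth, n_participants, noise, seed=0):
    """Compare the error of the averaged estimate with the typical individual error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Independent participants: each error is drawn separately (condition 1).
    estimates = [truth + rng.gauss(0, noise) for _ in range(n_participants)]
    # Aggregation mechanism: a plain mean, where errors partly cancel (condition 3).
    crowd_error = abs(statistics.mean(estimates) - truth)
    individual_error = statistics.mean([abs(e - truth) for e in estimates])
    return crowd_error, individual_error


crowd, individual = crowd_vs_individuals(truth=100.0, n_participants=1000, noise=20.0)
print(f"crowd error: {crowd:.2f}, typical individual error: {individual:.2f}")
```

If the errors are correlated instead (everyone copying the same wrong answer), the cancellation largely disappears, which is the groupthink failure mode in point 1.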
I view the topics in James Surowiecki's book (or the Wikipedia summary of it, at least) as required thinking for everyone, preferably synthesized with a study of statistics and political economy.
In particular, the Wikipedia article's section on "Five elements required to form a wise crowd" is a slightly different slicing of the required elements than the one I offer above.
* If you read that section, trust is listed. I, however, don't see trust as a necessary condition for a "wise crowd". Trust is often useful (or even necessary) when a collective decision is used for governance, decision-making, and policy.
When I go to https://github.com/github-copilot/free_signup it says:
" Congratulations! You are eligible to use GitHub Copilot for free.
Thanks for being a part of our open source and education communities. GitHub Copilot uses the Codex AI model to offer coding suggestions."
I have a project with about 3k stars, and regularly contribute to another project with ~4k stars (where I'm also the primary maintainer, although it's not on my account), as well as some things with hundreds and dozens of stars.
I don't know how high up that is in the ranking, although given that most projects get 0 stars I suspect it's probably higher than you might expect.
I think it really depends on what languages you use though. If you use something like Kotlin where there's really almost no boilerplate and the type system is usefully strong, the symbolic logic auto-completion is just far more reliable and helpful. If you're stuck in a language where there's no types, and there's lots of boilerplate to write, then I can see it may be more helpful.
"Intent is not relevant to copyright infringement liability."
"But your honor, I heard on Hacker News that it was."
"I find you guilty."
"But your honor, copyright violation is usually a civil issue, and 'guilty' is a criminal trial concept."
"Well, I also get my legal training from Hacker News."
E.g., if I take that Disney movie, incorporate it into my own movie, and distribute it, then I'm also violating copyright.
And you might argue that Copilot is also a distributor.
If you trace a picture and use it in your work of art, does the copyright of the original picture no longer apply?
If you copy a tune but set it to new instruments, does the copyright of the original tune no longer apply?
Sampling is a legal minefield in music, why would it become less of a minefield in code just because you've automated it? So far the best attempt at an answer about the legal issue of Copilot I've seen was that it's "not technically violating copyright", which honestly is not very reassuring and extremely morally inconsistent for a company built by a guy[0] who is philosophically invested enough in intellectual property as the pillar of human society to write An Open Letter To Hobbyists and use his Foundation to convince entire governments of adhering to IP laws instead of allowing the mass production of vaccines and medicine.
[0]: Yeah, I know that he no longer serves an active role in the company but this was very much a founding ethos and this is at least a fair bit hypocritical.
Copilot isn't sampling. Sampling is literally copying snippets of someone else's music and putting it into your music. Copilot doesn't do that. There's no giant database of text that it just slurps suggestions out of.
Just seeing someone else's code is hazardous from a legal precedent point of view
Also - there is logic in copilot that checks to make sure it is not suggesting exact duplicates of code from its training set, and if it does, it never sends them to the user.
I wouldn't put any hard rules on it, but it does seem very fair for programmers who have learned a lot from GPL code to contribute back to GPL projects. I have learned from and used a lot of open source software so whenever possible I try to make projects available to learn from or use.
Why is it different if we slap an "ML" label on it?
There's a limit to what individuals are willing to pay for a subscription service irrespective of how many hours it saves you. Now if we're talking enterprise and bulk licensing then that's a separate issue.
You also have to (slightly) change your flow to get the most out of it, which I know is a deal breaker for many.
I absolutely love it. It's not going to write good code for you, but for an autocompleter it is amazing.
It might be possible, I don’t know about “highly”. Have you checked the license exclusions required to use Github? Their terms already carve out a Copyright exception for Github, because they need it in order to host your code. There’s also no reason Github can’t filter certain licenses, or make it impossible to complete entire functions, or build an option for everyone to opt-in to being autocomplete source material regardless of license, right? Any legal challenges are likely to result in changes to the feature before there are ever any serious repercussions.
I think it’s at least as likely, if not more so, that Copyright Law could evolve in response to the growing number of AI auto completers, and we (society) try to allow it within reason by being more specific about what constitutes automated infringement and who’s responsible for it. Fair Use currently exists but is vague and left up to courts to decide. In the meantime, Copyright is primarily intended to foster a balance between business and freedom of expression, and there’s a lot of open source software on Github that cares about freedom of expression and not about business. In any case, we don’t really want Copyright to represent some kind of absolute ownership land-lock over every string of 100 characters, that is a bit antithetical to both Copyright and the FOSS community.
Triply so when Microsoft is involved.
As far as copilot goes, yes it’s possible to get it to recite copyrighted works, but in normal usage it is creating independent works because it is too influenced by the structure of your code around the insertion point to recite anything. It’s auto completing things like the variable names that you already declared, simple loops and function applications, etc.
> What that means legally has yet to be fully determined.
At least in the US, the Supreme Court ruled in Google v Oracle that the entire Java API is not copyrightable. Copilot users are very far from crossing the line, the courts are not going to come after some de minimis 10-line snippet that copilot generated.
Whether Microsoft itself was legally in the right by training copilot is a more interesting legal question that remains unresolved.
Of course, copilot is only going to save you typing time, and you'll have to pay it back at reading time.
An example here is an infinite list. It's far easier to do that in say Haskell and python, applying filters along the way using higher order and curried functions than it is to do it in say C.
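A rough Python sketch of that infinite-list idea. The predicate and the numbers are just illustrative, and `functools.partial` stands in for curried functions, which Python does not have natively.

```python
from functools import partial
from itertools import count, islice


def is_multiple_of(divisor, n):
    """Two-argument predicate; partial() fixes the first argument below."""
    return n % divisor == 0


naturals = count(1)  # lazy and unbounded: 1, 2, 3, ...
multiples_of_three = filter(partial(is_multiple_of, 3), naturals)  # still lazy
squares = map(lambda n: n * n, multiples_of_three)  # higher-order function, still lazy
first_five = list(islice(squares, 5))  # only now are values actually computed

print(first_five)  # [9, 36, 81, 144, 225]
```

Nothing is evaluated until `islice` pulls values, so the "infinite" list never has to exist in memory. Doing the same in C means hand-writing the iterator state and the loop.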
Yes, you can shove a 20 line function into one line with a ton of weird symbols, but the one reading it after you will need to unroll it anyway to understand it so what's the point?
That will destroy the propaganda that the Law is there to protect content creators better than anything the people against copyrights can come up with.
100% of what I do is open source. It's used by millions.
It's free for maintainers of "major" open source projects. I'm not sure what a "major" open source project is, but it's clearly not what I do. The only way to know if your open source project qualifies is to try to sign up. If it does, you're given a free option.
I am the primary author (but not current maintainer) of an open-source project which is reported to be used by over 100 million people, according to (flaky) statistics kept by the current maintainers. That's around 1% of the people in the world.
I don't trust the current maintainers to be honest with numbers (there are lots of ways to estimate numbers of users), but it's definitely in the millions, and it's a project you (and most random people you'll meet in tech, and many outside of tech) will have heard of.
I am currently working on earlier-stage projects, which have smaller communities, but 100% of them are open-source.
Goal of library/language/framework designers: limit boilerplate and unnecessary code
Goal of tool/IDE designers: make it easy to not spend time on boilerplate and unnecessary code.
Both sides have built in limitations that will keep them from completely solving the problem. Terse, highly DRY code tends to also be highly abstracted and hard to read, with a lot of implicit behavior. On the other side, large amounts of generated code lead to tool lockin and cruft accumulation.
If co-pilot’s autogenerated test cases can help prevent this head smacking, it will have proved that basic/boilerplate code was valuable.
The site checker included in the tooling is just a more mature version of the boilerplate unit test co-pilot gives us.
Btw, I have no skin in the game -never used co-pilot…just surprised that HN commenters can be so dismissive of wanting to get the basics right - like having some test coverage.
I think there might be some in this thread who don't consider these derivatives, for whatever reason, but it seems to me that if rangeCheck() passes de minimis, then the output from Copilot almost certainly does, too. That a tool is doing the copying and mutating, as opposed to a human, seems immaterial to it all. (Now, I don't know that I agree with rangeCheck() not being de minimis … and yet.) Or they think that Copilot is "thinking", which, ha, no.
> We will both continue to work on decreasing rates of recitation, as well as making its detection more precise.
But it probably won't be worth millions of dollars. And that is why the lawsuit won't be worth it.
> That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.
Anything "may" be possible. But it probably won't be worth that much.
I'd suggest getting more information about the repercussions associated with appropriating GPL code into proprietary closed source.
This is a big deal. You may have to license your entire codebase under GPL if you incorporate GPL code and distribute it.
I would suggest that you actually take your own advice and get more information yourself.
No license can force you to release your code. Nope, not even GPL.
Instead, what a rights holder can do, is sue for damages for the copyright theft, for not following the license. They can't force you to follow the license. Instead, they can say that you didn't follow it, therefore you stole the code, and owe money to them, for stealing the code, depending on how much the code is worth.
The only thing that GPL does, is it gives people permission to use the works, in exchange for releasing code. But, if you infringe, the damages do not depend on whatever the license was, or whatever request the license makes.
To use an example someone else gave, of the "first born child" license, imagine someone writes a simple binary search function, and puts out a license that gives it out for free, in exchange for paying them some absurd price. EX: the joke of the first born child, but more seriously, let's say the license was "1 million dollars".
If someone stole that couple-line binary search function, and it went to court, they absolutely would not owe them 1 million dollars, even though that's what the license said.
Instead, they would owe the rights holders damages. And chances are, a couple line binary search function, or some other example that you could think of, would only be worth a small amount.
And even though the license said "This code is worth 1 million dollars, and you owe us that money if you use it!", it is not true that anyone would owe them a million dollars. Instead they would only owe them damages, which would not be anywhere close to 1 million dollars.
No one has won billions of dollars on GPL enforcement. It's not how courts work. Contrary to popular belief, courts also won't compel compliance (e.g. releasing my code); if I break your license, the standard recourse is damages, whether that's GPL or All Rights Reserved.
Otherwise, I'd make the First Born Child license, whereby by using my code, you give me full ownership of your first born child, your home, your car, and your bank account. I could write a license like that right now, but I couldn't force you to give me your child, car, bank account, and home. If you used my code, you'd have the option to accept the license and give me those things. Or you could reject it, in which case, it's a normal copyright violation; in that case, whatever I wrote in the license is moot, and you pay damages (and stop using my code).
The only part which wouldn't be valid in a contract was the first-born child. That was a joke.
Indeed, if the GPL were a contract, courts might compel compliance.
However, the GPL is not a contract, it's a license. The FSF bent over backwards to make sure the GPL/AGPL licenses wouldn't be viewed as a contract, in part to limit liability / damages / risk.
Confusingly, some EULAs are framed as contracts, contrary to the acronym, and do expose users to much more risk of liability than the GPL.
The relevant part of the GPL is:
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission to
receive a copy likewise does not require acceptance. However, nothing
other than this License grants you permission to propagate or modify
any covered work. These actions infringe copyright if you do not accept
this License. Therefore, by modifying or propagating a covered work, you
indicate your acceptance of this License to do so.
We often like to take a plain-text read, but that's misleading; this is legal jargon. It's one of those bits of text which needs to be explained by a lawyer, and one who specializes in both licensing and contract law.
My jurisdiction has no concept of fair use.
Please remember this from the guidelines
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
I don't think that's true.
When the parent comment made that observation, they attached the caveat they might not be as skilled as others. They were already fully aware their potential lack of skill might affect their opinion of the product. All you did was repeat that same claim back to them, as if they weren't already aware of it which is a pretty uncharitable interpretation. A steelman interpretation that you could've said would assume there are some low-hanging fruit new or inexperienced developers would benefit from greatly (not just typing as you suggest), but once you develop a certain level of skill, Copilot would become less useful for experts such as yourself.
If anything, you didn't respond to the strongest plausible interpretation of what was said, since you willfully disregarded their own insight into the problem.
Then to try and morally lecture someone on their behavior by applying a rule you don't even hold yourself to is pretty astonishing.
I find you having to write 3 long paragraphs of bullshit about that to be the truly astonishing thing here. Holy crap.
Or the windows xp leak and how that is a mess for wine/proton/reactos devs
There is no concept of fair use in code copyright
Copilot is 100% an ask-for-forgiveness-later sort of project
Some companies seem to be leaning into higher subscription pricing (Superhuman and Motion come to mind) and almost certainly produce far more value than their subscriptions cost if you ask me, but there's definitely a mental barrier to value based pricing to consumers, as well as the fact that with so many companies offering cheap/free software, the market isn't solely determined by value created but rather comparison against other software.
Also, that little "building confidence in yourself" rider that you added suggests that you think the OP doesn't have confidence in themselves. Careful about those assumptions; in this case it comes across as a little patronizing.
They certainly do have confidence, but it doesn't hurt building upon it. I don't think confidence is a discrete 0 or 1 variable. It's a continuous variable, from 0 to infinity.
By the way, I asked a question; I didn't make a statement. I welcome the argument on whether it's worth the time to memorize the stuff Copilot can autocomplete. That's something to measure and debate.
> Careful about those assumptions; in this case it comes across as a little patronizing.
Thanks for the heads up. This is important. But I didn't make any assumption. It seems maybe you made an assumption about me making an assumption?
If that's the case, don't worry as I did not feel patronized.
It's not a silver bullet / holy grail / MacGuffin / etc.
There's a community of people obsessed with spaced repetition. None of them seem to have accomplished spectacular feats of learning. There's a good reason for that.
(The flip side, however, is that many people who have accomplished spectacular feats of learning DO often use spaced repetition, but among a broader repertoire).
> None of them seem to have accomplished spectacular feats of learning
> many people who have accomplished spectacular feats of learning DO often use spaced repetition
---
Second note:
> Spaced repetition is a good tool for learning vocabulary.
This might be a misconception? The first ever published paper about spaced repetition was about vocabulary.
But research has shown its merits are valid for many other types of knowledge.
By the way, memorizing coding vocabulary is arguably similar to human language vocabulary.
It's not contradictory.
I'll give an example: Landmines are a good tool for slowing an enemy army. However, if your military consists of *just* landmines, it won't be very effective. That doesn't make landmines a bad tool. Indeed, even a super-weapon, like the first jet fighter, won't win a war if it's your *only* tool.
Learning -- even a language -- is a complex process, and you need many tools. Spaced repetition is awesome for factoids. If you want to learn a language, you need to memorize vocabulary. SR is great for that. If you add 5-15 minutes of spaced repetition to a good language program, it will help a lot. If that's where you spend a majority of your time, you'll learn very little. However, SR won't help you practice a broad range of skills around listening, speaking, understanding communication styles, or quite a few other things.
Ditto with physics and math. If you know equations, it accelerates everything else. However, the bulk of the knowledge isn't factual or procedural; it's conceptual. Simply memorizing formulas won't help you. On the other hand, in most cases, once you learn conceptual knowledge in physics, you never forget it.
"Coding vocabulary" isn't in my top-50 problem with junior developers. Naming variables, algorithms, systems design, etc. are. Most of those don't align to SR. I'd take a programmer who spends 8 hours coding over one who spends 8 hours memorizing library calls in an SR system.
Note 2:
The spacing effect is /somewhat/ broadly applicable (but far from universal), but spaced repetition specifically is only helpful for factoid-style knowledge. You can look over the different classifications of knowledge, skills, and abilities (factual, conceptual, procedural, declarative, etc.).
It’s not always right, but it’s right often enough that it’s a big timesaver.
That assumes I once had the knowledge of what is "correct".
I just don't quite remember it right now. Then I rely on Copilot to complete it.
Sometimes I may not feel sure enough to judge whether Copilot is right. I'll need to dig into the documentation anyway.
Other times I'll feel sure. But in how many of those will I be wrong? And what will be the consequences?
> what will be the consequences?
for me, usually a failed build or unit test :P low stakes stuff
If I write AGPL code, and co-pilot scans it and makes a very similar program to it for a FAANG, who then proceeds to compete with my open-source tool by using the creative ideas generated therein, but with a proprietary tool, that's not very fair. That's why I chose the license I did.
FAANG is more than welcome (indeed, encouraged) to use my code for any purpose permitted under the license. That includes everything except making it proprietary.
I've tried running copilot with the starting lines of my code. It generated code with identical creative ideas. It was the equivalent of taking Star Trek, and generating a new movie with the same plot line, but with names changed. That's not legal.
My code was specific enough that this wasn't just chance or other similar code. I work in a pretty narrow domain.
I did use copilot for coding myself, and a lot of what it generated was unique. But it is also a good paraphrasing tool. Running a movie script backwards and forwards through Google Translate to get different phrasing, and then swapping out new names, does not a new movie make. Ditto here.
You would expect that from a program that copies a database of all the examples in the world (or whatever) and then just does an autocomplete without any kind of comprehension of what the problem is that is trying to be solved.
No confusion or contradiction at all.
> it produces code that is correct in some circumstances, but is incorrect for the author's use case.
That’s a mighty convoluted way of saying “incorrect code” ;)
Phew! I feel better, now!
It will be a gift. Gifts are valid, but they require free will of the gifting party. Gifts, without free will, can be easily canceled by court.
Almost anybody doing software development at a well-funded tech company is going to be making over $120k/yr, yes. But it turns out there are lots of other kinds of programming jobs, too.
In isolation, most developers could easily afford the $10/month for copilot. But most developers are probably using the free tier for half a dozen services. So the question isn't "Can I afford copilot?", but rather "Does copilot provide more value than upgrading plans on some other service?". For example, if you are using the free tier on Slack, maybe upgrading to the paid tier so you can access the full chat history provides way more value than copilot.
Also, another consideration is that $10 per month is certainly small. But I generally use software I purchase for multiple years. I would guess on average I use a piece of software for 3-5 years. If Copilot was offered for a single purchase price of $300-500, would you pay for it? Because that is likely how much you will spend over the lifetime of the subscription. For me, that price point is approaching the territory of professional tools like CAD software, Photo/video editing software, etc...
I can certainly see why Copilot would be worth $10/month. But I also could see why someone might be uncomfortable with that.
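A quick check of the lifetime-cost arithmetic above, as a minimal sketch. The $10/month figure is the price under discussion; the $100/year annual plan is my assumption about where the $300-500 range comes from, and `lifetime_cost` is just an illustrative helper name.

```python
def lifetime_cost(price_per_period: float, periods_per_year: int, years: int) -> float:
    """Total spend on a subscription that is kept for `years` years."""
    return price_per_period * periods_per_year * years


for years in (3, 4, 5):
    monthly = lifetime_cost(10, 12, years)   # $10 billed every month
    yearly = lifetime_cost(100, 1, years)    # assumed $100 annual plan
    print(f"{years} years: ${monthly:.0f} on monthly billing, ${yearly:.0f} on annual billing")
```

On monthly billing the 3-5 year total comes out somewhat higher than the $300-500 quoted, which only strengthens the comparison with one-off professional tools.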
Can you name the most useful ones? So far my only subscription is Idea. I'm considering trying Copilot as I've heard many good things about it.
Saving time with Copilot is itself a learning process and a probabilistic affair. Copilot can win you a few seconds at a time, but can easily set you back minutes if you aren't careful or experienced. It's the probability of a downward spike in time-win that makes it such a gamble. Such complex deals just turn on the cautious side of my brain.
Also, if I magically knew that I could save you 3 hours yearly, but it were spread out over the course of a year, and that your savings would occasionally spike down into negative and then slowly climb up, I just wouldn't entertain such a complex offer at such low numbers. People pay insurance just to avoid such incidental downward spikes.
Copilot's biggest limitation right now is that you can't dare to allow minutes of savings per day without inviting the risk of a severe spike in debugging time, the kind that wipes out all your savings. This means you cannot spike up.
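To put rough numbers on the "small wins, occasional big spike" argument, here is a back-of-the-envelope sketch. Every figure in it is invented for illustration (5 seconds per accepted suggestion, 30 suggestions a day, a one-hour debugging spike), and the function names are hypothetical.

```python
def daily_net_seconds(seconds_per_win, wins_per_day, spike_seconds, days_between_spikes):
    """Average seconds gained per day once occasional debugging spikes are averaged in."""
    gain = seconds_per_win * wins_per_day
    loss = spike_seconds / days_between_spikes
    return gain - loss


def break_even_days(seconds_per_win, wins_per_day, spike_seconds):
    """How many days must pass between spikes, on average, for the tool to break even."""
    return spike_seconds / (seconds_per_win * wins_per_day)


# Made-up inputs: 5 s saved per suggestion, 30 suggestions per day,
# and one 3600 s (1 hour) debugging spike every 20 working days.
print(daily_net_seconds(5, 30, 3600, 20))
print(break_even_days(5, 30, 3600))
```

With these assumed inputs the daily average is negative, and a one-hour spike has to be rarer than once every 24 working days before the seconds saved add up to anything.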
- Money is worth more to me than the average US dev because I earn less than US developers, and therefore my time is definitely worth less.
- I cannot use this for work at my current workplace and I'm willing to bet a lot of other companies aren't fine with it either. I'm not saving time where it makes me money, so I would classify it as a luxury, not a tool (spending-wise).
If it has a tangible ROI, then I figure out how much my time is worth, I figure out how much time or other resource the SaaS app will save and then decide if it's worth the tradeoff. For example, I suck at graphic design, so a monthly $13/mo to Canva is worth it to me to save time, aggravation, and headache, not to mention improved quality of results. I know that I save myself much more in time than the $13/mo is worth.
On the other hand, I can't justify paying even $15/mo for a podcast transcription tool because I still have to spend dozens of hours checking the transcription and it doesn't save me any headache. So it's not worth it to me. It doesn't matter if it's $60/yr or $100/yr, my time is still worth the same. If it's not worth it at $60/yr, it's not worth it at $100/yr.
Maybe this thought process is different for others, but with so much SaaS out there, it's important to focus on what will drive high value. Incremental "auto-yes" spending at any price point can get you into trouble.
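The ROI test described above can be written down in a few lines. This is only a sketch: the $50/hour rate is an assumed value for the example, the $13/mo Canva price and the $60-100/yr transcription prices come from the comment, and the helper names are made up.

```python
def monthly_net_value(hourly_rate, hours_saved_per_month, monthly_price):
    """Value of the time saved in a month minus the subscription price."""
    return hourly_rate * hours_saved_per_month - monthly_price


def break_even_minutes(hourly_rate, monthly_price):
    """Minutes per month the tool must save to pay for itself."""
    return 60 * monthly_price / hourly_rate


HOURLY_RATE = 50  # assumed value of one hour of my time, in dollars

# Design tool at $13/mo: how little does it need to save to be worth it?
print(break_even_minutes(HOURLY_RATE, 13))

# Transcription tool that saves no time: negative at either yearly price.
for yearly_price in (60, 100):
    print(yearly_price, monthly_net_value(HOURLY_RATE, 0, yearly_price / 12))
```

The second loop is the point made above: when the hours saved are zero, no price in that range makes the result positive.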
It is possible to provide Copilot with a sequence of inputs that makes it reproduce some of its training input, which was copyrighted. Let's say you want to help people violate copyright, so you as a third party distribute a script that provides that sequence of inputs. Who's violating the copyright there?
Alternatively -- it is apparently legal to produce a clean-room implementation that duplicates a copyrighted implementation. Suppose you were to use a tool like Copilot, which has just been trained on that copyrighted implementation. Is your room still clean? You might even be able to get it to spit out identical functions!
Or, if you have a ML algorithm which has been trained on leaked closed source code, and it is sufficiently over-fitted as to just provide the source code given the filename or the original binary, who is violating copyright when this tool is used? If it is just the end user, then this seems like a really convenient way to launder leaked closed source code.
If I induce you to break a contract with someone else they can come after me for damages.
For example in this case, there are developers who have created GPL code. That code was licensed to some other developer. Github then encouraged people to upload git copies of the GPL code onto github where it was put into the model. That model contains the copyrighted materials and isn't coming with the necessary notices. The output of the model can be code that is a direct stand in for the copyrighted work. Thus Github have become a party to breaking the license even though they themselves never agreed to the GPL.
In addition Github are encouraging (They are advertising it and making it available broadly) other developers to copy that code and use it in their project. Again that's encouraging an action that breaks a contract. Github is well aware that this is likely happening and they continue on. Thus they might be liable. You also might be liable.
All of these things can and likely will be argued before courts but it's not at all one sided.
> That's been established in court with regards to training AIs.
What are you basing the certainty of this statement on? The case law I have seen around this is pretty spotty. Cases around training on copyrighted materials have predominately been about the input, and not the output. With the final output usually being controlled by the model owner. For example Google obtained the books they scanned legally then used them to produce google books' index. There are some major differences.
- The books were purchased, meaning they got a license to use the book. There's for sure code in the model that Github does not legally have the right to use. They are aware of this. Making the input more shaky for github.
- Github is making a direct profit off of this service. It's a revenue generating enterprise. That's important since it raises the bar of what they can be expected to do.
Nothing has gone to the Supreme Court yet; it's all per circuit and not settled case law. Also, this gets WAAAAY more complex once we start talking about jurisdictions outside the US, where it isn't decided at all.
These things are complex and likely you need your lawyer to advise you with any real questions.
I should have put "reuse" in quotes, since I meant copilot takes reuse one step further and replicates or regurgitates code.
What I wrote applies. One could argue that yours is different because it specifies what is bullshit. Yet that’s what most responses are about. A mix of statements like:
“why do you care so much”/“lol you care so much” or “wow you really wrote that much” or “your long writing is all BS and excuses”.
I don’t want to attack you as a person or make you feel bad. No insinuation or implication. I am directly saying stuff. If I did read something that’s not there, I’m up for being shown how I am wrong.
Also gives you a fixed URL and is free, and there are quite a few other free tools out there.
To be clear: we’re in agreement that incorrect code that passes for correct at a glance is even worse than obviously-incorrect code.
In most cases, damages are set to make both parties straight, not to be punitive. People cite how trillion dollar companies might have billion-dollar lawsuits, but that's pretty reasonable. $1B damages are 0.1% of a company's value in a battle between FAANGs, which have big-O trillion-dollar valuations. If you have a dispute between $1M businesses, the analogue is $1k damages. That's not atypical for a commercial dispute.
I did not say it forces you to distribute it. That's absurd.
What I said is: "if you incorporate GPL code and distribute it"
If you do those two things, yes, you have to license your code under GPL.
It's not just me saying this; please take a look at Sections 5(b) and 5(c) of the license. [1]
Let's do an experiment: You need to hit yourself repeatedly in the head with a mallet until you pass out.
Are you currently hitting yourself with a mallet until you pass out? No. Just because something is written doesn't mean you need to do it. If I incorporate your GPL code, distribute it, and don't license my code under the GPL, that means I'm distributing code without a license (or breaking a license). Unless I've crossed the line for criminal prosecution (which is far from anything we're discussing here), the worst-case consequence of that is .... damages.
If I've crossed the line into criminal prosecution, then the consequence is damages and jail time. I absolutely STILL do not need to license my code under the GPL.
(In most cases, it's a good idea to license code under the GPL, though, both due to branding/reputation damage, and since usually that leads to an out-of-court settlement; but neither of those carries any legal force.)
This is not how the law works. In addition to damages, if you're a party to a civil lawsuit then a court can order you to do something. This is called an "injunction".
For example, if I write something and you start selling copies of it without permission, and I sue you over your copyright infringement, a court can and will order you to stop. Copyright has teeth like that.
If the thing you were selling was your product -- based illegally on my GPL'd code -- then that may be a lot worse for you than some damages.
If not, then you do have to license your entire work under GPL if you incorporate GPL code and distribute it.
If yes, what kind of environment do you think you're promoting? Is it positive for the development of the industry, and to society in general?
No, it doesn't. I didn't write the type of come-back you're arguing against. I didn't write "lol you care so much" or anything to that effect. You're just making up straw men.
DantesKite wrote "I don't think that's true." when I explained what I meant about a comment, and then spent three paragraphs twisting my words and making up a story about what I wrote. Which is dishonest, I know what I meant better than they. Me pointing out that they spent three paragraphs arguing in bad faith isn't me saying what you're accusing me of.
This may be a bit nit-picky, but I don't think that is correct.
Most books I've seen don't say anything about granting a license so there would be no explicit license that comes with them.
Maybe you could find an implicit license if normal use of a book required a license but it does not. Copyright law allows all the normal uses of a book without requiring permission of the copyright owner. You only need a license when you want to do something that requires permission.
I was saying that there's some implied license after first purchase. I believe that was part of the court's decision. Paying for a book (or a library paying) gives you implicit rights to fair use. Github's copies of code were not purchased. They were sometimes provided by third parties.
So there's likely some room to argue that fair use rights are different enough between previous cases and github.
The solution to that is to remove or replace those lines.
That's not worse than damages. That's just table stakes. That's expected no matter what happens. If I had a few lines of GPL code in a proprietary code base, I'd do that the day it was discovered.
To understand the frequency of injunctions, have a look at this test:
https://en.wikipedia.org/wiki/Injunction#Permanent_injunctio...
Injunctions generally only happen if other means (like damages) have been exhausted.
"can" is a complex question. You can do anything you want, but actions have consequences. I can buy a gun and shoot someone. The consequence is that I might spend the rest of my life in prison. I can fart in a crowded elevator. The consequence is that people will look at me funny, and might dislike me.
Consequences should be proportional to the action.
If farting in an elevator led to life in prison, or if shooting someone led to people looking at me funny, things wouldn't work very well.
> If not, then you do have to license your entire work under GPL if you incorporate GPL code and distribute it.
No. This is not a proportional consequence. If a random developer incorporates 10 lines of GPL code into Windows, Microsoft doesn't need to license Windows under the AGPL. That's not how our legal system is set up.
Microsoft has to remove the code and pay damages.
> If yes, what kind of environment do you think you're promoting? Is it positive for the development of the industry, and to society in general?
The logic you're suggesting is not only incorrect, but would lead to an environment where people have an irrational fear of "viral" licenses. They're intentionally not viral. They don't infect code. Releasing your code is one option for remedy, but not one the GPL author can force. The FSF bent over backwards to design the license like that.
Damages and removing code is an appropriate consequence. It's adequate to prevent most license violations, and still not overly draconian. I don't know of any business which has gone under due to an error around the GPL. That's as it should be. If the GPL were business-toxic, it wouldn't set up a successful ecosystem.
Think of it: If Nevada gave the death penalty for littering, would you litter less? Or simply never, ever, ever travel to Nevada?
In this case, I don't know of a reasonable remedy. I don't want to shut down copilot, but I do feel bad about having my code stolen from me. Perpetual license for everyone whose code was used to develop co-pilot? A nominal stock grant in Open AI? I dunno. When I've seen class action lawsuits, those are the sorts of places things usually land. Indeed, it's usually just short of being fair.
What I want is a copilot that finds errors, spellcheck-style. Did I miss an early return? For example, in the code below:
def some_worker
    if disabled_via_feature_flag
        logger.info("skipping some_worker")
    some_potentially_hazardous_method_call()
Right after the logger call I missed a return. A copilot could easily catch this. Invert the relationship. I don't need some boilerplate generator, I need a nitpicker that's smarter than a linter. I'm the smart thinker with a biological brain that is inattentive at times. Why is the computer trying to code and leaving mistake catching to me? It's backwards.
Hmmmm, that is actually a good observation.
The main problem with that is, GPT-3 can't do that. Personally, while I sing the praises of GPT as a technology, and I do mean it, at the same time... it's actually not a very useful primitive to build further technology on. On top of the question "if you were to continue this text, what would you continue it with?" it is hard to build much more than what you see with Copilot. There is no concept of "why are you continuing it with that?" (which, in some sense, the neural net can answer, but the answer exists in a way that humans can not understand and there is no apparent practical way to convert that into something humans can understand).
So GPT-x may yet advance and is fascinating technology, but at the same time, in a lot of ways it's just not that useful.
It reminds me of the video game world, where we have just staggeringly unbelievable graphics technology, and everything else lags behind this spike. Being visual creatures, it causes us to badly overestimate what's actually going on in there. Similarly, it's great that AI has these talkerbots, but they've made a whole lot of progress on something that gives a good appearance, but doesn't necessarily represent the state of the art anywhere else. This AI branch of tech is a huge spike ahead of everything else. But it's not clear to me this technology is anything but a dead end, in the end, because it's just so hard to use it for anything truly useful.
No-code, visual programming, gherkin, even SQL are all prior attempts at reducing the expense of software development, and of sidestepping the expensive, excuse-laden gatekeepers that are software developers.
Copilot is an MVP of a technology that will probably eventually succeed in doing this, and my guess is, it's going to make CRUD slinging obsolete very soon.
Copilot is not backwards, it's just that it's a convenience tool for the execution of business, not for software developers.
When version 2 of the tool can both code and error check, hopefully you're already promoted to architect by then...
No way Microsoft made this investment for a measly 10 dollar subscription. There are not that many developers
What problem does the following pseudocode have?
def some_worker
    if disabled_via_feature_flag
        logger.info("skipping some_worker")
    some_potentially_hazardous_method_call()
And receive this response: The problem with this pseudocode is that there is no "end" keyword to close off the "if" statement. This means that the code after " some_potentially_hazardous_method_call()" will always be executed, even if the "disabled_via_feature_flag" condition is true.
And that's with a GPT3 without any special fine tuning. Of course, the name `some_potentially_hazardous_method_call` is pretty leading in itself. I rewrote the prompt slightly more realistically, as: What problem does the following code have?
def optionally_do_work():
    if disabling_flag:
        logger.info("skipping the work due to flag")
    do_work()
and received: The problem is that the code will still try to do the work, even if the flag is set.
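For what it's worth, a case this encapsulated doesn't strictly need a language model: a plain syntax-tree pass can flag it too. Below is a minimal sketch using Python's standard `ast` module. The heuristic is my own assumption, not something Copilot or any real linter does: it flags an `if` with no `else` whose body consists only of calls, where one of them carries a string containing "skip", and which is followed by more statements in the same function. The function names are hypothetical.

```python
import ast

SOURCE = '''
def optionally_do_work():
    if disabling_flag:
        logger.info("skipping the work due to flag")
    do_work()
'''


def is_bare_call(stmt):
    """True for a statement that is just a function or method call."""
    return isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)


def mentions_skipping(stmt):
    """True if the statement contains a string literal with 'skip' in it."""
    return any(
        isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and "skip" in node.value.lower()
        for node in ast.walk(stmt)
    )


def find_fallthrough_guards(source):
    """Line numbers of `if` statements that look like guards missing a return."""
    hits = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        body = func.body  # only the function's top-level statements are checked
        for index, stmt in enumerate(body):
            followed_by_code = index < len(body) - 1
            if (
                isinstance(stmt, ast.If)
                and not stmt.orelse
                and followed_by_code
                and all(is_bare_call(s) for s in stmt.body)
                and any(mentions_skipping(s) for s in stmt.body)
            ):
                hits.append(stmt.lineno)
    return hits


print(find_fallthrough_guards(SOURCE))  # prints [3]
```

It is crude: it keys on the log message wording and ignores anything nested inside loops or other blocks, so treat it as an illustration of the kind of check being asked for rather than a usable lint rule.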
This does seem like a pretty trivial easier-than-fizzbuzz question to be asking, though, since it's so encapsulated.
The insistence of a lot of smart people on using whitespace for logic purposes is THE most baffling thing in the IT space. And I mean that.
Why use some, oh I don't know, CHARACTER, to write down what you mean, why not instead use a NON CHARACTER. Now that's a great idea!
Let's use non characters, so a mix of tabs and spaces (which editors can implement AND display in a number of different ways, even depending on individual configuration!) fucks up shit. Using whitespace is also great because copy/paste is now an error-prone exercise in frustration, which is definitely what we want! Oh and also this will make sure that the peasants are not using our beautiful language IN NAUGHTY WAYS, e.g. you can't really write a deeply nested logic in a single function if it becomes a complete abomination of a mess just after like two or three indentations.
No, but seriously, Python's syntax in regards to whitespace, or any language that uses whitespace for control structures, is hot garbage. I understand that there are preferences, coding standard, etc. and I can tolerate a lot, but this, this is the one hill I'm willing to die on.
In my opinion, if the function has done its job, it should return. That's what return is for. As the function grows, the else side of the branch gets longer and longer and it is error-prone to leave the first branch of the if statement reliant on it.
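To make the two styles concrete, here is a small sketch reusing the names from the earlier snippet. The `log` list and the body of `do_work` are stand-ins I added so the example runs on its own.

```python
import logging

logger = logging.getLogger(__name__)


def do_work(log):
    """Stand-in for the real work; records that it ran."""
    log.append("work done")


def with_else(disabling_flag, log):
    # The skip branch stays correct only as long as everything
    # else in the function remains inside the else block.
    if disabling_flag:
        logger.info("skipping the work due to flag")
    else:
        do_work(log)


def with_guard(disabling_flag, log):
    # The skip branch ends the function explicitly, so code added
    # further down cannot run by accident when the flag is set.
    if disabling_flag:
        logger.info("skipping the work due to flag")
        return
    do_work(log)


for flag in (True, False):
    a, b = [], []
    with_else(flag, a)
    with_guard(flag, b)
    print(flag, a == b, a)
```

Both versions behave the same today; the difference only shows up later, when someone appends a statement after the `if`/`else` in the first version or forgets to indent it.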
I dunno about this. I know the received wisdom is that "writing the code isn't the hard part", but I think reality is more like "writing the code is only one of the hard parts". There's an awful lot of badly-written code, or code which is only partly correct, or only correct under some circumstances. The only way to make writing code not one of the hard parts is to specify 100% of the functionality, every corner case, and all test scenarios, before any code is written. And then you still have to verify that it was translated correctly into code, which I think we can all agree is another one of the hard parts!
Conceiving the solution is hard, thinking of edge cases, what-ifs, and failure scenarios is hard, creating effective tests is hard, and writing the actual code understandably and correctly is also hard!
Writing the code isn't the bottleneck. And there is no point in optimizing some part of a process that isn't a bottleneck.
Anyway, have you noticed that "understandably and correctly" isn't included in the OP's definition of "writing code"? That's for a reason, and it's the most adequate definition to use in this context.
1. It has shifted some of the code-writing I do from generation to curation.
Most of the time, I have to make some small change to one of the first options I get. Sometimes I don’t. Sometimes I get some cool idiomatic way of doing something that’s still wrong, but inspires me to write something different than I originally planned. All of these are useful outcomes — and unrelated to whether someone is “actually thinking about what they’re writing”.
2. It has changed my tolerance for writing redundant code, for the better.
Like many programmers, I tend to optimize my code for readability first, and then other things later when I have more information. Sometimes, my desire for readability conflicts with my desire for code that avoids redundancy (e.g., “oh but if I put these three cases into an array I can just use a for loop and don’t have to write out as much code” etc. etc.) — and my old bias was avoiding redundancy more often than not. But copilot is really great at generating code that has redundancy, which has often helped me write more readable code in quite a few cases.
3. I refactor code way more now.
In part this is because, given code that already works but is not ideal (e.g., needs to be broken into more functions, or needs extra context, or some critical piece needs to be abstracted), copilot does a fantastic job at rewriting that code to fit new function prototypes or templates. IDEs can help with this task, for a few common types of refactoring, but copilot is way more flexible and I find myself much more willing to rewrite code because of it.
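As a rough illustration of the trade-off in point 2, here is a hypothetical validator written both ways. The field names and messages are made up for the example and do not come from any real codebase.

```python
def validate_redundant(user):
    # Spelled out case by case: repetitive, but each rule is easy to
    # read and to change independently.
    errors = []
    if not user.get("name"):
        errors.append("name is required")
    if not user.get("email"):
        errors.append("email is required")
    if not user.get("password"):
        errors.append("password is required")
    return errors


def validate_loop(user):
    # Same checks folded into a loop: less code, slightly more indirection.
    required = ("name", "email", "password")
    return [f"{field} is required" for field in required if not user.get(field)]
```

The redundant form is the kind of code that is tedious to type by hand but cheap to get from an autocomplete tool.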
Copilot is not what many people want it to be, in much the same way that Tesla’s Autopilot is not what many people want it to be. But both do have their uses, and in general those uses fall into the category of “I, as a human, get to watch and correct some things instead of having to generate all things.” This can be very useful. (FWIW, it takes some time to adapt to this; I teach and mentor a lot and I found myself relying on those skills a ton when working with copilot.)
We shouldn’t discount this usefulness just because these systems don’t also have other usefulness that we also want!
Here's how it actually works in practice:
1. Start a line to do an obvious piece of code that is slightly tedious to write.
2. Type 2 characters and Copilot usually guesses what you want based on context. Perhaps this time it's off.
3. No matter, just type another 3 characters or so and Copilot catches up and gives a different suggestion. I just hit "tab" and the line is complete.
It really shines in writing boilerplate. I admit that I'm paranoid every time it suggests more than 2 lines, so I usually avoid it. But in about a year of using it I've run into Copilot-induced headaches twice. Once was in the first week or so of using it. I swore off using it for anything more than a line then. Eventually I started to ease up since it was accurate so often, and then I learned my second lesson with another mistake. Other than that it's done nothing but save me time. It's also been magnificent for learning new languages. I'll usually look up its suggestions to understand them better, but even knowing what to look up is a huge service it provides.
My right arm is swollen to twice its normal size right now and it hurts to type for more than ten minutes (hopefully OK in a few days), so I can imagine an advanced autocomplete being really useful for some disabilities.
And pray tell, how much typing is required to go back and fix the incorrect code produced by copilot?
P.S.: wishing you a speedy recovery!
I figure advanced auto-complete should not produce big blocks of code that are more likely to have logical errors in them, since the grandfather comment here suggested that problems show up when you generate larger blocks of code.
Of course, one would then ask how to verify tests. I suppose Copilot could write meta-tests - tests that verify other tests. That way it could test its own tests and tweak them until they work.
Of course, one would then ask how to verify meta-tests. I suppose Copilot could write meta-meta-tests - tests that verify meta-tests. That way it could test its own meta-tests and tweak them until they work.
Of course, one would then ask how to verify meta-meta-tests...
Sure it can. But you can't rely on them being good. You have to read the tests carefully.
But yeah, the hard part of writing nontrivial software isn't typing code, it's the software architecture and design.
I thought it might be more useful to me for a language I’m already good at, or one I’m not trying to master but just need to get a task done for.
If you look at your typical C-like code, it is already whitespace-oriented; you just manually add the braces as line noise, to make it easier to write a compiler for the language (although even that may not be true). It is like using one-character variable names, which, other than in the trivial places, makes your code harder to read.
If you want to write deeply nested logic in a function: well, don't. But if you insist, I'm not sure how curly braces help you in this case.