Introducing the Windows Pseudo Console (ConPty) (blogs.msdn.microsoft.com)
But also it's the fact that three decades of not even life support has left the Windows console in pretty sad shape -- the folks tasked with getting it into better shape were bound to see the value of ptys.
Lastly, don't forget that Windows NT was meant to be a console OS, like VMS. There must still be people, even if very few, at MSFT who appreciate text-oriented apps.
For me, the tty/pty, shells, screen/tmux/..., ssh, and so on, are the things that make Unix so powerful. The fact is that Win32 is far superior in a number of areas (SIDs >> UIDs/GIDs, security descriptors >> {owner, group, mode, [ACL]}, access tokens >> struct cred), but far inferior in the things that really matter to a power user trying to get things done.
I expect that, like Linux compatibility, most of it is not about "apps" but about being better at running in the cloud, where a (virtual) machine or container needs to be as light as possible, and to be configured and have a service launched in it in as unattended/automated a manner as possible. Stripping out the GUI and making command lines work better works towards these goals.
In fact it bothered me more that I couldn't get a Borland-like devenv on Linux and had to keep myself happy with XEmacs.
Isn't it strange that today everybody has very powerful GPUs and CPUs and the graphical displays with immense RAM and then using all that to emulate the terminals last existing decades ago appears to be so important, even for something that should be just a secure communication protocol?
Why do we still spend so much energy to decide which console of many decades ago we "support" when it seems that all are flawed, at least compared to what the modern OSes can provide, as soon as the "compatibility" is not needed?
Isn't all that "hardware console" compatibility stuff just a historical accident from the "bad old days" of 300 baud lines between the mainframe and the "terminal" which had a few bytes of RAM total? In the days when e.g. the Thunderbolt 3 can carry 5 GB/s, and the rest of the hardware matches? Why do people still so cling to it? I'd really like to know what I am missing.
In the UNIX world, that's what it gives you - a stream of bytes. Hence things like rsync-over-ssh or git-over-ssh. It also has a port forwarding mode which has special support for X11, which gives you remote windowing over a stream of bytes too.
The main, huge, benefit is that the abstraction is pretty simple, it's discoverable, and you can use the same interface as a human. You can also plug any stream-of-bytes into any other stream-of-bytes, whereas API or RPC based systems have to be designed to interoperate.
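A minimal sketch of the "plug any stream-of-bytes into any other" point. The helper `pipe_through` and the two filter strings are hypothetical names made up for this illustration; each stage is just a child Python process that reads bytes on stdin and writes bytes on stdout, so the stages need no knowledge of each other.

```python
import subprocess
import sys


def pipe_through(data: bytes, *stages: str) -> bytes:
    """Feed data through a chain of child processes.

    Each stage is a one-line Python program (hypothetical filters for this
    sketch) that only sees a byte stream on stdin and stdout.
    """
    for code in stages:
        result = subprocess.run(
            [sys.executable, "-c", code],
            input=data,
            stdout=subprocess.PIPE,
            check=True,
        )
        data = result.stdout
    return data


# Two unrelated filters that were never designed to interoperate.
UPPER = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"
REVERSE = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])"
```

Calling `pipe_through(b"abc", UPPER, REVERSE)` chains the two filters. The stages run one after another here for simplicity; a real shell pipeline would run them concurrently.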
Yes, structured data exchange is the correct answer. When I have the opportunity to code something from scratch, this is the route I take.
But how often does that happen, outside of toy systems and support utilities?
To elaborate: although an ordinary POSIX pty doesn't inherently have a terminal type - that's entirely down to whatever emulator is connected to the master side - the way the ConPty system translates Console API calls into terminal control codes means that it necessarily needs to pick a terminal emulation, which all actors in the ConPty system are expected to use.
A terminfo database entry would be useful both for applications running on *NIX hosts but displaying on a remote ConPty master somewhere, as well as for porting existing terminal applications to Windows where they will run on a ConPty slave.
As a follow-up question, presumably this means that the SSHD running on Windows as a ConPty master needs to translate between whatever terminal emulation the ssh client is connected to and the one expected by ConPty / ConPty apps (in the same way it must translate between the native ConPty UTF-8 and the remote charset)?
Anyways, this is fantastic. Finally, proper ssh functionality!
This will encourage development of console (text-oriented) apps for Windows, which I hope will be much simpler. Interfacing with the console can be really difficult if you're coming from *nix. Ideally all the WIN32-specific code in, e.g., jq[0], could be ripped out.
[0] https://github.com/stedolan/jq (look in src/main.c)
For asynchronous signals, like SIGINT, Windows creates a new thread out of thin air to deliver your app a notification. That's not really all that much better than a signal from a concurrency perspective.
Windows even has APCs, which are like regular signals that are delivered only at explicit system call boundaries.
Every operating system needs some mechanism to tell a process to do something. Windows has evolved an approach that isn't all that different from Unix signal handling.
This is so un-Windows-like.
I guess, too, that this is the end of codepages -- I doubt they'd go away, but there should be no more need to struggle with them, just use UTF-8. You'll still need a semblance of locale, for localization purposes, naturally, but all-UTF-8-all-the-time is a great simplification.
https://github.com/dotnet/coreclr/commits/feature/utf8string
TL;DR of this announcement: We've added a new pseudoconsole feature to the Windows Console that will let people create "Terminal" applications on Windows very similarly to how they work on *nix. Terminals will be able to interact with the conpty using only a stream of characters, while commandline applications will be able to keep using the entire console API surface as they always have.
Without this, I/O redirection is slightly broken. Last I checked you can't change where stderr goes after the process starts, for example. [SetStdHandle doesn't do it at the right layer.]
https://github.com/Microsoft/WSL/issues/111#issuecomment-238...
Awesome to see it's finally up and running! \o/
```
alias node='winpty node.cmd'
```
With the new ConPTY, will I be able to run native Windows programs directly? If so, that would be huge, winpty (while I'm really thankful it exists) is a PITA in practice, see e.g. https://github.com/Microsoft/vscode/issues/45693.
Do you think that the people who implemented the Windows Console, especially the people working on Windows NT, did not know about Unix? People try different approaches, sometimes they don't work out.
And it's not like Unix is the Word of God, anyway, it has plenty of flaws.
(Yeah, after a long time on internet forums I get kind of touchy after someone copy-pastes the same old and tired line.)
Maybe they knew how a kernel should work though, but weren't the NT guys old VMS guys? That's a totally un-unixy OS actually.
Sometimes people just insist on using stuff that sucks.
HRESULT WINAPI ResizePseudoConsole(_In_ HPCON hPC, _In_ COORD size);
If Microsoft is in the mood to fix old problems, right ^there you've got another old problem: its bizarre API that is different to everything else. Designed that way to lock everyone into their OS. In 2018 nobody has the time to learn this. Just use a cross-platform API and if it doesn't run on Windows then just don't run Windows.
As a developer, using Windows for development is against your own best interest. If you like to be treated as a dog that is not allowed inside the house, use Windows.
Commandline applications on linux rely on a TERM setting (with termcaps) to be able to know what VT sequences the terminal is able to support. On Windows, we only really have one terminal, conhost.exe, and our goal there is to be compatible with TERM=`xterm-256color`. That's why you'll see that WSL has that set as the default term setting.
Now even with ConPTY, when a client writes VT sequences, they still need to be interpreted by conhost. This is because technically, a console application could use both VT and the console API, and we need to make sure the buffer is consistent. So clients should still assume that they should write out `xterm-256color` compatible sequences.
Now on the other side of things, the "master"/terminal side of conpty, we're going to "render" the buffer changes to VT. Fortunately, we don't really need a deep VT vocabulary to make this possible, so the VT that's coming out of a conpty is actually pretty straightforward, probably even vt100 level (or I guess vt100-256colors, as insane a termcap that would be).
It's definitely a future feature that we'd like to add to make conpty support multiple different TERM settings, and change the sequences we emit based on what the terminal on the other side is going to expect.
We haven't really gotten into the nitty gritty of all of this quite yet, so if you find bugs or have feature requests, we're happy to take a look at them. You can file issues on [our github](https://github.com/microsoft/console) and we'll add them to our backlog
The nitty-gritty can get quite nitty - things like bracketed paste and set window title.
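For readers who have not seen these sequences, here is a small sketch of what an `xterm-256color`-style client might write: a 256-color foreground, a window title, and the bracketed paste markers. The function and constant names are made up for this illustration; the byte sequences themselves are the usual xterm ones, and nothing here claims to reflect what conhost accepts or emits.

```python
from typing import Optional

ESC = "\x1b"
BEL = "\x07"

RESET = f"{ESC}[0m"
BRACKETED_PASTE_ON = f"{ESC}[?2004h"   # ask the terminal to wrap pastes
BRACKETED_PASTE_OFF = f"{ESC}[?2004l"
PASTE_START = f"{ESC}[200~"            # sent by the terminal before pasted text
PASTE_END = f"{ESC}[201~"              # sent by the terminal after pasted text


def fg256(index: int) -> str:
    """SGR sequence selecting one of the 256 indexed foreground colors."""
    if not 0 <= index <= 255:
        raise ValueError("color index must be in 0..255")
    return f"{ESC}[38;5;{index}m"


def set_title(title: str) -> str:
    """OSC 0 sequence that sets the window (and icon) title."""
    return f"{ESC}]0;{title}{BEL}"


def extract_paste(data: str) -> Optional[str]:
    """Return the text between the paste markers, or None if there is none."""
    start = data.find(PASTE_START)
    end = data.find(PASTE_END, start + len(PASTE_START))
    if start == -1 or end == -1:
        return None
    return data[start + len(PASTE_START):end]
```

With bracketed paste on, an application can tell typed input from pasted input, which is one defense against the pastejacking tricks that in-band signaling otherwise invites.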
Even worse, sometimes they won't even disable escape codes when they should not be displayed.
I've posted bug reports for very popular software packages whose command-line tools always output vt102 sequences, even when TERM is set to dumb or when run through pipes. That makes grepping for error messages somewhat annoying. In at least some cases these reports were ignored.
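The check being asked for is small. A hedged sketch, assuming the common Unix convention that an unset or `dumb` TERM and a non-tty stream both mean "no escape codes"; `should_emit_escapes` and `style_error` are hypothetical names, and a Windows-native tool would need a different test than TERM.

```python
import os
import sys


def should_emit_escapes(stream=None, environ=None) -> bool:
    """Decide whether VT escape sequences are appropriate for this stream."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    # Assumption for this sketch: empty or "dumb" TERM means no escapes.
    if environ.get("TERM", "") in ("", "dumb"):
        return False
    # Pipes and files are not terminals, so keep their contents grep-friendly.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def style_error(message: str, stream=None, environ=None) -> str:
    """Wrap message in red only when the destination can render it."""
    if should_emit_escapes(stream, environ):
        return f"\x1b[31m{message}\x1b[0m"
    return message
```

A program would call `style_error("link failed", sys.stderr)` before writing, so that `prog 2>&1 | grep failed` sees plain text.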
Do you have some vision or plans to go well beyond the classic UNIXy style of console and command line? I'm thinking in the lines of projects like DomTerm http://domterm.org/ which could have nice interactions with e.g. PowerShell.
I haven't seen DomTerm before, but it looks pretty awesome. At a glance, it's basically a GUI-fied tmux hosted in Electron? It would be awesome to have in Windows, but wouldn't that just require that DomTerm add support for these ConPty APIs?
In any case, I'm more interested in your proposed interactions. Did you have anything cool in mind? Given that we ship PowerShell on Linux, we could theoretically do some stuff there (including within PowerShell on WSL) before it's hooked up to ConPty
I presume many tools deal with this issue, and do it in different ways. Perhaps it is as simple as making the console itself only appear once there is any output, or a blocking read of input.
Now, I believe that python could have python.exe compiled as a win32 application, then call AllocConsole as soon as the script called print() or something. If the app was already running in a console, I believe (don't quote me) that AllocConsole won't allocate a new console for it, but if it doesn't yet have a console it'll spawn one.
So, for example if I was to pipe into 7z.exe, a classic console app, using something like "type mybinaryfile.bin | 7z.exe a -si c:\temp\myarchive.7z" from a ConPTY console, would the VT translation affect the piped stream?
Windows console applications aren't really able to live without being attached to a console. Now, a terminal might be able to implement those features...
actually now you've got me thinking. I'll play around with that idea. Definitely non-committal, but it might be possible in the future.
Input is also tricky - VT doesn't let you express input with as much fidelity as a console app might be expecting, though we're working on a solution for this :)
All commandline clients run attached to a console server, and that server is conhost.exe. Conhost is responsible not only for being the console server, but drawing the actual terminal window these apps run in. So when you launch cmd or powershell, what you're seeing is conhost.exe "hosting" these console applications.
What we're exposing here is the "master side" of conhost, which will let other applications act as Terminals, like how there is gnome-terminal, xterm, terminator, etc on linux.
We're still working with ConEmu, VsCode, and OpenSSH to get them all over to the new API, with varying levels of adoption in the next few months likely.
Currently, WSL is also using the same functionality, if you open a WSL distro and run any Windows executables (eg `cmd.exe`), they'll run attached to a conpty. I use this as my daily driver.
Source : I have the t-shirt (polo shirt actually).
see [this docs page](https://docs.microsoft.com/en-us/windows/console/console-vir...) for a (surprisingly incomplete) list of VT sequences we support, and how to use them.
2. Who says that in-band communication like Unix is doing is necessarily better? See pastejacking and other shenanigans.
I've been working on a terminal emulator ( Extraterm http://extraterm.org/ ) with some novel features which would dovetail nicely with how PowerShell works. The first is the ability to send files to the terminal where they can be displayed as text, images, sound, etc or as an opaque download. Extraterm also adds a command `from` which lets you use previous terminal output or files, as input in a command pipeline. See http://extraterm.org/features.html "Reusing Command Output" for a demo. This opens up other, more interactive and iterative workflows. For example, you could show some initial data and then in later commands filter and refine it while visually checking the intermediate results.
What I would like to do sometime is integrate this idea with PowerShell and its approach of processing objects instead of "streams of bytes". It should then be possible to display a PowerShell list of objects directly in the terminal, and then reuse that list in a different command while preserving the "objectness" of the data. For example, you could show a list of user objects in one tab and then in another tab (possibly a different machine) grab that list and filter it the same way as any normal list of objects in PowerShell. You could also directly show tabular data in the terminal, let the user edit it "in place" in the terminal, and then use that edited data in a new command. It allows for more hybrid and interactive workflows in the terminal while still remaining centered around the command line.
Extraterm does these features using extra (custom) vt escape codes. ConPty should allow me to extend these features to Windows too.
I would highly recommend you check out the excellent HistoryPx module[1]. Among (many) other things, it supports automatically saving the most recently emitted output to a magic `$__` variable. Theoretically, you could save a lot further back, but you may start to run into memory constraints (turns out .NET objects are a little heavier than text... ;) )
SSL is straightforward compared to that, at least, once the keys are set. But ssh... as seen in the OP even the console or the terminal or however that part is called has to be very special, and they are obviously proud they implemented that too. In 2018. Probably decades after the last single hardware terminal was sold.
I think there is probably a lot of room for improvement in the terminal world, and I agree that a lot of the really old stuff makes things a bit counter-intuitive, but for whatever reason, it seems that people who make really good software also tend to be the people who are pretty fanatical about backwards compatibility. Consider vim, for instance.
In the end, I think the basic model of interoperable, small programs that manipulate streams of text is really good - so people will put up with any number of weird rituals to live in that model. It's also very humble, and very unexciting, so it's the kind of thing that's hard to get people hyped about. So it probably filters by the people that like old things.
That is the user perspective, not what has to be in a ssh program to work.
So, if someone wants to make a "dual-mode" app that works as a win32-subsystem app when launched from Explorer and a console-subsystem app when launched from a console, they have to choose between two bad options. They can make their app a console-subsystem app, which means a console will always briefly appear on screen when the app is started (no matter how quickly the app calls FreeConsole(),) or they can make their app a GUI-subsystem app (that opportunistically calls AttachConsole(),) which behaves sub-optimally in cmd.exe.
Maybe the solution is to add a flag (in the .manifest file?) that makes the console initially hidden for a console-subsystem app. That would prevent the brief appearance of a console window when launching a console-subsystem app from Explorer. Then there would be no need for pythonw.exe and python.exe could show the console window only after a message is printed.
MSDN doesn't really say much about AllocConsole(), if that's the right function: https://docs.microsoft.com/en-us/windows/console/allocconsol...
If AllocConsole does behave in the way you say (which I understand it may{, not}), then the documentation sorely needs updating, because right now that bit of functionality (if it is there) is rather implicit.
It would be really cool to effectively deprecate the current console functionality and make it relatively straightforward to use the PTY API going forward, adding the bits people need to support use cases like this (allocating a console when/as it's needed).
Perhaps Visual Studio could introduce a new template for commandline applications that targets the PTY, and put "(Recommended)" next to that one? :D
Please no!
The console subsystem in Windows is what terminal I/O evolved into, in the 1980s. Going back to terminal I/O is a massively retrograde step.
In any case, for addressing the problem at hand removing the console API is not the answer. rossy has explained what the problems actually are, which relate to the command interpreter waiting for processes to terminate and whether Win32 program image files are marked as GUI or console.
Personally I have always regarded that behaviour of Microsoft's command interpreter as a flaw, not a feature. I've always turned it off in JP Software command interpreters, which make it configurable. I didn't implement it in my command interpreter. However, I do appreciate that Microsoft's strong commitment to backwards compatibility hampers what can be done.
IMO, the "real" answer - the viable one :) - is a redesigned console model that supersets/encapsulates what already exists in such a way that things remain backward compatible but incorporate features that allow for progressive enhancement. The question is in how to actually build that out; what to start with, what to do when and where, etc.
Probably the most important thing I'd start with is having each console be like an independent terminal server: make it so anything can watch all the stdio streams of everything attached to the PTY, make it possible to introspect the resulting terminal stream, etc. Then the TTY itself could be queried to get the character cell grid, query individual characters, etc. And also make it possible to change arbitrary PTY+TTY settings [out from under whatever's using a given console] as well.
By "anything can watch" I mean that there would be an actual "terminal server" somewhere, likely in a process that owned a bunch of ptys, and this would have an IPC API to do monitoring and so forth. Obviously security and permissions would need to be factored in.
But this would roughly take the best from the UNIX side (line disciplines are kind of cool, having the PTY architecture Just Work with RS232 is... I understand the history, but it makes for an interesting current-day status quo, IMO), and then combine this with the best of the Win32 side (reading what's on the screen!!!!! Yes please!!).
I'm not sure how to build something sane that could incorporate graphics though. ReGIS and Sixel are... no. 8-bit cleanness is an unfortunately-probable requirement for portability (at least UTF-8 can be shooed away in broken environments with a LC_ALL=C), but base64 encoding is also equally no. Referencing files (what Enlightenment's Terminology does) is a nononono. w3m-image's approach of taking over the X11 drawable associated with the terminal window is awesome in its hilarious terribleness. The best I can think of is a library that all image/UI operations would be delegated to, which would do some escape-sequence dances with the terminal (and whatever proxies were in the way?) to detect capabilities and either use out-of-channel communications (???) to send the image data over rapidly, or alternatively 8-bit or 7-bit encode the image data into the TTY stream (worst case scenario).
This is a bit hazy/sketchily laid out, but it's something I've been thinking about for several years. When I started out pondering all of this stuff circa 2006 I was most definitely all over the place :) I'm a bit better now but I still have a lot of unresolved ideas/things. I'm trying to build a from-scratch UX that provides a more flexible model to using terminals and browsing the web, but in a way that's backward-compatible and not "different to the point of being boring".
I've (very slowly...) come to understand that slow and progressive enhancement is the only viable path forward (that people will adopt), so I'm trying to understand the best way to do that.
When the user's (programmer's) intent is to run a program with no console window, then that's what they should get: no console window.
There's also some funky stuff about explicit AllocConsole-allocated consoles; for example, when you attach a native debugger, all output from such console is automatically redirected to that debugger (i.e. the VS Output window or similar). This is very annoying in practice.
Programmatic access to scroll back is useful for a few things. For example, back when I was on Windows Phone, I wrote a compiler wrapper that would scroll back to the first error message.
It'd be nice for the POSIX terminal world to standardize on similar scrollback access. I know the zsh people would love it.
It's not the most elegant solution, but we're still very early in on this project. We still have lots of improvements to be made to the infrastructure and translation, and even as I type this up, I'm thinking there's probably a better way of handling alt buffers.
* https://docs.microsoft.com/en-gb/windows/console/console-scr...
* https://technet.microsoft.com/en-gb/library/bb497016.aspx
* https://technet.microsoft.com/en-gb/library/bb463219.aspx
* https://news.ycombinator.com/item?id=12866843
* http://jdebp.info./FGA/interix-terminal-type.html
And Microsoft owns it.
Mind you, ptys + tmux/similar is certainly very good, and if that's all we'll get that's still way way better than the current state of affairs, but if that's all that will be possible it should at least be possible to pause the console's output (and flow-control the console application).
There are already CONIN and CONOUT objects. One simply needs to make the latter a synchronization object, the former already being waitable. The console screen buffer maintains a dirty rectangle of cells that have been dirtied by any output operation, be it high-level or low-level output. It becomes signalled when that dirty rectangle is not zero-sized, the cursor is moved, or the screen buffer is resized. And there's a new GetConsoleScreenBufferDirtyRect() call that atomically retrieves the current rectangle and resets it to zero.
With that, capturing console I/O is simply a matter of waiting for the console output buffer handle along with all of the other handles that one is waiting for, getting size/cursor info and clearing the dirty rectangle with GetConsoleCursorInfo()/GetConsoleScreenBufferInfo()/GetConsoleScreenBufferDirtyRect(), and reading the new cell values of the dirty rectangle with ReadConsoleOutput().
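To make the proposal concrete, here is a toy in-memory model of it. `GetConsoleScreenBufferDirtyRect()` is the commenter's proposed call, not an existing Windows API, and every name below (`ScreenBuffer`, `take_dirty_rect`, and so on) is hypothetical. The sketch is single-threaded, so it only mimics the "retrieve and reset" semantics, not real atomicity or handle waiting.

```python
from typing import List, Optional, Tuple

# (left, top, right, bottom), all inclusive, in cell coordinates.
Rect = Tuple[int, int, int, int]


class ScreenBuffer:
    """Toy console screen buffer that tracks a dirty rectangle."""

    def __init__(self, width: int, height: int) -> None:
        self.cells = [[" "] * width for _ in range(height)]
        self._dirty: Optional[Rect] = None

    def write_cell(self, x: int, y: int, ch: str) -> None:
        """Any output operation grows the dirty rectangle to cover the cell."""
        self.cells[y][x] = ch
        if self._dirty is None:
            self._dirty = (x, y, x, y)
        else:
            left, top, right, bottom = self._dirty
            self._dirty = (min(left, x), min(top, y), max(right, x), max(bottom, y))

    def is_signalled(self) -> bool:
        """Stand-in for the output handle becoming signalled."""
        return self._dirty is not None

    def take_dirty_rect(self) -> Optional[Rect]:
        """Model of the proposed call: return the rectangle and reset it."""
        rect, self._dirty = self._dirty, None
        return rect

    def read_rect(self, rect: Rect) -> List[str]:
        """Stand-in for reading the cells inside a rectangle."""
        left, top, right, bottom = rect
        return ["".join(row[left:right + 1]) for row in self.cells[top:bottom + 1]]
```

A capturing process would loop on "wait until signalled, take the rectangle, read those cells", which is the flow described above minus the real synchronization object.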
The difference is important, since in the traditional MS model, each program that wants to do the remote thing needs to essentially implement its own client-server setup, albeit with a massive amount of help from various runtimes. Named pipes and central authentication made this approach not quite as horrible as it sounds.
This new API is a departure from this model. It will make it possible to just remote via text streams. Perhaps that's uglier --- everyone knows in-band signaling is fragile. But long experience shows that just remoting the damn text streams is easily the more pragmatic option.
The fragility of in-band signaling in the TTY world is not *nix's fault or anything. VMS had that too, since VMS too had to deal with TTYs. The fact is that a) the system evolved from real, hardware TTYs of the 1970s, b) using a text stream with some in-band signaling doesn't even half suck -- mostly it rocks, and all you have to do for it to rock is get the right $TERM value and not output binary files to the tty.
You can set various limits, though I haven't seen functions to stop/resume a job.
This feature is mostly focused on the other end of the communication, on being able to create new Terminal windows to run shells inside of them.
cmd.exe is a shell
conhost and cmder are terminals.
I believe cmder can come with git bash as well, which is also a shell.
The confusion comes from when you launch cmd, the window that appears by default is conhost, with cmd running attached to it. When you launch cmder, cmd is likewise running attached to it.
This does make coding for utf-8 harder, but when it works is really wonderful stuff.
UTF-8 was designed (as legend has it, on the back of a napkin during a dinner break) after the increase in range and doesn't suffer from the same problem. Additionally, its straightforward "continuation" mechanism isn't any more difficult to deal with than surrogate pairs, and it doesn't have any endianness issues like UTF-16/UCS-2.
If not:
UTF-16 is born of UCS-2 being a very poor codeset, as it was limited to the Unicode BMP, which means 2^16 codepoints, but Unicode has many more codepoints, so users couldn't have the pile-of-poo emoticon. Something had to be done, and that something was to create a variable-length (in terms of 16-bit code units) encoding using a few then-unassigned codepoints in the BMP. The result yields only a sad, pathetic, measly 2^21 codepoints, and that's just not that much. Moreover, while many codesets play well with ASCII, UTF-16 doesn't. Also, decomposed forms of Unicode glyphs necessarily involve multiple codepoints, thus multiple code units... Many programmers hate variable length text encoding because they can't do simple array indexing operations to find the nth character in a string, but with UTF-8, UTF-16, and just plain decomposition, that's a fact of life anyways. If you're going to have a variable-length codeset encoding, you might as well use UTF-8 and get all its plays-nice-with-ASCII benefits. For Latin-mostly text UTF-8 also is more efficient than UTF-16, so there is a slight benefit there.
Much of the rest of the non-Windows, non-ECMAScript world has settled on UTF-8, and that's a very very good thing.
UTF-8 uses a variable length encoding that allows for more characters-- if restricted to four bytes, it allows for 2^21 total code points; it's designed to eventually allow for 2^31 code points, which works out to about 2 billion code points that can be expressed.
(Granted, this is all hypothetical-- Unicode isn't even close to filling all of the space that UTF-16 allows; there aren't enough known writing systems yet to be encoded to fill all of the remaining Unicode planes (3-13 of 17 are all still unassigned). But UTF-16's still nonstandard (most of the world's standardized on UTF-8) and kind of ugly, so the sooner it goes away, the better.)
* Your timeline is backwards. UTF-8 was designed for a 31-bit code space. Far from that being its future, that is its past. In the 21st century it was explicitly reduced from 31-bit capable to 21 bits.
* UTF-16 is just as standard as UTF-8 is, it being standardized by the same people in the same places.
* 17 planes is 21 bits; it is 16 planes that is 20 bits.
https://www.joelonsoftware.com/2003/10/08/the-absolute-minim...
https://en.wikipedia.org/wiki/Comparison_of_Unicode_encoding...
I was confused about this for years, too. But it turns out it's just a problem of bad naming. Happens more in this industry than we'd like to admit.
As others explained, it boils down to UTF-16 being 16-bit, and UTF-8 being anything from 8- to 32-bit. It should have been named UTF-V (from "variable") or something, but here we are.
UTF-8 is a variable-length encoding using up to 4 code units (though it used to be up to 6, and could again be up to 6) each of which is 8 bits wide.
Both UTF-16 and UTF-8 are variable-length encodings!
UTF-32 is not variable-length, but even so, the way Unicode works a character like ´ (á) can be written in two different ways, one of which requires one codepoint and one of which requires two (regardless of encoding), while ṻ (LATIN SMALL LETTER U WITH MACRON AND DIAERESIS) can be written in up to five different ways requiring from one to three different codepoints (regardless of encoding).
Not every character has a one-codepoint representation in Unicode, or at least not every character has a canonically-pre-composed one-codepoint representation in Unicode.
Therefore, many characters in Unicode can be expected to be written in multiple codepoints regardless of encoding. Therefore all programmers dealing with text need to be prepared for being unable to do an O(1) array index operation to get at the nth character of a string.
(In UTF-32 you can do an O(1) array index operation to get to the nth codepoint, not character, but one is usually only ever interested in getting the nth character.)
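A quick way to see the codepoint-versus-character distinction is to normalize the same letters both ways with Python's standard `unicodedata` module. The helper `codepoints` is just a name made up for this sketch.

```python
import unicodedata


def codepoints(text: str) -> list:
    """List the codepoints in text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The same user-visible letter, composed and decomposed.
a_composed = unicodedata.normalize("NFC", "a\u0301")
a_decomposed = unicodedata.normalize("NFD", "\u00e1")

# LATIN SMALL LETTER U WITH MACRON AND DIAERESIS, fully decomposed.
u_decomposed = unicodedata.normalize("NFD", "\u1e7b")

print(codepoints(a_composed))
print(codepoints(a_decomposed))
print(codepoints(u_decomposed))
```

Indexing into the decomposed strings yields a base letter or a lone combining mark, never "the nth character", and that holds in UTF-32 just as it does in UTF-8 or UTF-16.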
Hints:
* You have the wrong country, on the wrong continent.
* It's not as simple in reality as your first answer will be. (-:
It's actually a lot more elegant than you'd expect it to be for a project with as many developers as Windows has
UTF-16: Encodes the entire 21-bit range, encoding most of the first 0000 to FFFF range as-is, and using surrogate pairs in that range to encode 00010000 to 0010FFFF. The latter range is shifted to 00000000 to 000FFFFF before encoding, which can be encoded in the 20 bits that surrogate pairs provide. This is a subtlety that one likely does not appreciate if one learns UTF-8 first and expects UTF-16 to be like it.
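The shift-then-split step is easy to miss, so here is a minimal sketch of it. The function names are hypothetical; the arithmetic is the standard UTF-16 surrogate computation and can be checked against Python's own `utf-16-be` codec.

```python
def to_surrogates(code_point: int) -> tuple:
    """Encode a supplementary-plane code point as a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF use surrogate pairs")
    shifted = code_point - 0x10000          # now fits in 20 bits
    high = 0xD800 | (shifted >> 10)         # top 10 bits
    low = 0xDC00 | (shifted & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Decode a surrogate pair back to the original code point."""
    return 0x10000 + (((high - 0xD800) << 10) | (low - 0xDC00))
```

For U+1F4A9 (the pile-of-poo emoji mentioned earlier) this gives the pair `0xD83D, 0xDCA9`, and the largest pair `0xDBFF, 0xDFFF` decodes to U+10FFFF, which is why the upper limit sits a little above 2^20.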
UTF-8: Could originally encode 00000000 to 7FFFFFFF, but since the limitation to just the first 17 planes a lot of UTF-8 codecs in the real world actually no longer contain the code for handling the longer sequences. Witness things like the UTF-8 codec in MySQL, whose 32-bit support conditional compilation switch is mentioned at https://news.ycombinator.com/item?id=17311048 .
Not exactly. A conforming decoder MUST reject them.
MySQL’s problem is that, by default, it can’t even handle all valid code points.
> I'm not at all convinced that 2^21 codepoints will be enough, so someday it'd be nice to be able to get past UTF-16 and move to UTF-8
UTF-16 currently uses up to 2 16-bit code units per code point, whereas UTF-8 uses up to 4 8-bit code units per code point, and the latter wastes more bits for continuation than the former. How is "getting past UTF-16 and moving to UTF-8" supposed to increase the number of code points we can represent, as claimed above? If anything, UTF-16 wastes fewer bits in the current maximum number of code units, so it should have more room for expansion without increasing the number of code units.
And as you can see, if you do work out the bits, you find that cryptonector is wrong, since UTF-8 (as it has been standardized from almost the start of the 21st century, and as codecs in the real world have taken to implementing it since) encodes no more bits than UTF-16 does. It's 21 bits for both.
OTOH, UTF-8, as originally defined, can encode 2³¹ codepoints.
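The bit counts in this sub-thread can be worked out directly from the lead-byte layout. A small sketch, with `payload_bits` as a made-up helper name: an n-byte sequence spends n+1 bits of the lead byte on the length prefix and 2 bits of each continuation byte on the `10` marker.

```python
def payload_bits(length: int) -> int:
    """Bits of code point carried by a UTF-8 sequence of the given byte length.

    Lengths 5 and 6 exist only in the original design; the current standard
    stops at 4 bytes and at U+10FFFF.
    """
    if not 1 <= length <= 6:
        raise ValueError("UTF-8 sequences were defined for 1..6 bytes")
    if length == 1:
        return 7                      # 0xxxxxxx
    lead_bits = 7 - length            # e.g. 11110xxx carries 3 bits
    return lead_bits + 6 * (length - 1)


for n in range(1, 7):
    print(n, payload_bits(n))
```

So four bytes carry 21 bits, matching what surrogate pairs can reach, and only the six-byte form of the original design gets to 31 bits.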
I mean, both approaches have their pluses, but the API approach is only ever going to work well for remoting if it is standardized and interoperable. And the installed base of Unix termcap/terminfo programs is huge, so plain old text-with-in-band-controls is not going away anytime soon.
With regards to the comment then: the range downshifting you mentioned is merely a step in the encoding process though -- the code point is still whatever it was. If you read parent comment, it had claimed that, in a surrogate pair, each of the 2 code units encodes 10 bits of the code point... but that would be missing 1 bit when the code points need 21 bits to be represented. That's all I was saying there. The extra bit indicating that it's in fact a surrogate pair isn't some kind of implicit dummy bit that you can pretend isn't encoding anything -- if it wasn't there then clearly it wouldn't be encoding the code point for a surrogate pair anymore.
cmd.exe is a shell, and that's the guy that's parked.
conhost.exe is a terminal, and that's under active development, though it's slower than something like VsCode, because we can't just go adding features as we see fit, we have a LOT of back compat we still need to support.
Fortunately, conpty will allow for the creation of new terminal applications on Windows. If you're looking for a better shell experience on windows, I can point you to powershell or even [yori](http://www.malsmith.net/yori/), which looks pretty cool