Pattern Matching for Java(cr.openjdk.java.net) |
Pattern Matching for Java(cr.openjdk.java.net) |
So something like:
for(None n: expreTree.children()) {
intMath.eval(n);
}
will not work as expected if eval is overloaded.I hope it does go forward, though. Really useful feature, and given the constraints of the existing language, the proposed syntax looks pretty liveable-with.
Am I really the only one?
Or do you all just have buckets of source code filled with if instanceof then cast just screaming for this language feature?
Also how the heck do you destructure Java objects with private fields without resorting to bean accessors?
So not an invention of scala, only that is also lives on the JVM.
You can see it live! [1]
Edit: Just got to the bottom of the article. Looks like sealed hierarchies is exactly what they explore.
functionThatReturnsAnyVal match {
case i: Int if i > 2 => System.out.println("Greater than 2")
case i: Int => System.out.println("Less than or equal to 2")
case s: String => System.out.println("Hello, " + s)
case _ =>
} Object o = functionThatReturnsAnyVal();
if (o instanceof Integer) {
Integer i = (Integer) o;
if (i > 2) {
System.out.println("Greater than 2.");
} else {
System.out.println("Less than or equal to 2.");
}
} else if (o instanceof String) {
String s = (String) s;
System.out.println("Hello, " + s);
}
I get that the case syntax is kinda nice, but this particular example just doesn't seem to get there for me. Roughly half the lines, which is good. None of them hard to reason about. Which makes it a wash.Or is the comparison to something else?
Javas strength has always been to take the good parts of its competitors after they've checked out in production (and not just "wouldn't this be a great idea ..?") and implement them. It will probably never be ahead of the curve due to this, but at least it is remarkably free of "looks good in theory, useless in reality"-features.
It always amuses me to see features of ML, a language from 1973, described as "modern"; e.g. algebraic datatypes, pattern-matching, type inference, parametric polymorphism, etc.
C (from which many popular languages like C++, Java, C#, PHP, etc. are derived) came out in 1972.
I wouldn't say it's a case of being "modern", so much as paying attention to what else has been tried before, rather than sticking with what one already knows (when designing a language).
Each type has a `.Match` and `.Switch` methods, in to which you have to pass lambdas to handle each case `.Match(Func<T0, TResult, ..., Func<Tn, TResult>`.
I don't know if this would work in Java, given the generic type erasure, but it might...
FYI, I recently found out that C# can actually do multiple dispatch
https://blogs.msdn.microsoft.com/shawnhar/2011/04/05/visitor...
Actually the proposed java version is better than C#'s because it allows a form of exhaustiveness checking which C# doesn't even attempt to provide for.
However they had to add the `sealed` keyword too, which does not currently exist in java
Please use a Scala unapply() and not just constructor parameters. The former gives much more latitude to build patterns.
I have a ton more to say about all of this having written in Scala and Kotlin extensively. I'll just say that it is useful they are already thinking about new variable assignment to parts of matches and NOT concerning themselves w/ "smart casts".
The authors in the article even address this point, stating that some operations might make sense as instance methods if they are instrinsic to the type hierarchy, but others, especially ad-hoc queries, are extrinsic to the types and best expressed as pattern matching.
Simply calling this "bad code style" sounds a lot like a justification for only having a verbose way to use an alternative to virtual method dispatch.
1 - case classes / value classes / data classes, whatever you want to call them.
2 - match-and-bind syntax
...
11? - fancy pattern matching
This maybe says more about how much I pay attention to what's upstream in Java, but I found the fact that this is just a hypothetical proposal, in April of 2017, strangely shocking. I guess I figured it had to be on the docket for a future java version already.
This still works without case class extractors and is useful. That's what I mean by matching and binding.
Extracting/destructuring in the match statement can sometimes be more trouble than it's worth. It's definitely brittle to changes in the case classes. Then there's the question of the (difficult to reason about?) cost of the abstraction.
I liked that this article laid out the specific options and various shades of gray that can constitute parts of "pattern matching".
I'm curious for thoughts on a syntax like this:
x = match y:
case > 3: () => "it's bigger than three"
case 2: () => "it's strictly equal to two"
case Integer: (x) => `some int smaller than two: ${x}`
case String: (s) => `some string: ${s}`
case Array: ([ first, ...rest ]) => `a list starting with ${first}`
case Object: ({ a, b }) => `an object with property a: ${a}`
This is somewhat more difficult given that even when using Flow or TypeScript, there's relatively little type granularity available at runtime.Any thoughts?
* heredoc/multiline string/embedded interpolated strings * concise array, map, and object literal initializers. * structural/anonymous types
Simple things like multivalue return become an exercise in boilerplate in Java. As much as I like Immutables.org or @AutoValue, I should be able to return a struct or use destructuring operations at the callsite.
The hack in Java is to use annotation processors to provide nice fluent builder patterns, but really the language should have first class support for this.
http://openjdk.java.net/jeps/269 Just a few more month.
Still nothing for the other two, but 1/3 is better than nothing.
<T> T match(Function<X, T> foo, BiFunction<Y, Z, T> bar);
* in the cases where I don't have control over the common base type, I've written a builder pattern that constructs a sequence of predicate-function pairs and applies them appropriately. This looks like so: new PatternMatchingHelper()
.append(Foo.class, foo -> doSomething(foo.x))
.append(Bar.class, bar -> doSomethingElse(bar.y, bar.z))
.otherwise(x -> fallbackValue())
.apply(objectOfUnknownType);
The main disadvantage here is that it doesn't work well with generics, because the class objects have raw types.You could probably extend the second approach to get something close to the arbitrary predicates that pattern matching would provide, but the syntax wouldn't be nearly a clean as having it in the language.
- Type inference
- Everything is an object (no more primitives)
- no obligatory ';' as a statement separator
- Type-classes
- Scoped import statements
At which point we're actually recreating Scala and we might as well switch :)
I wonder if Oracle could create some excitement around the platform by creating or adopting a new language as an official repla^D^D^D^D^D "new member of the family" to implicitly succeed Java just like C# replaced Visual Basic for most use cases. Java could be kept around as a sop to die-hards like VB.NET was. It's great that the JVM allows a thousand flowers to bloom, but it's not great that the only "official" choice on the JVM is a language that hasn't been able to evolve very far from its 1990s roots.
EDIT: And to your point, I think it's a bad sign if their most profitable customers think that none of the Java language updates they've released in the last ten years is worth upgrading for. The longer their customers stick with 2006-era Java, the longer they pass up Oracle's current offerings, the less attached they feel to the future of the platform, and the more likely they are to make a big change when they do make a change.
if (is String name) {
// compiler knows 'name' is String in this block
print(name);
}
else {
print("some other text");
}
Why declare a new variable? The Ceylon syntax works great with union types too.https://news.ycombinator.com/item?id=12971841#12972691
In C#, the "wanting the supertype" answer makes some sense with explicit interface implementations [0]. Java doesn't have that, and AFAIK there's no way to declare something that's visible on a supertype but not a subtype.
[0] I think it's a bad reason. But the possibility does at least exist.
It generates code for algebraic data types which offer a typesafe interface similar to the `case` statements in languages that natively support pattern matching.
sealed trait Either[A, B]
final case class Left[A, B](a: A) extends Either[A, B]
final case class Right[A, B](b: B) extends Either[A, B]
and it is still not a real sum type due to exposing subtyping (cf. https://github.com/quasar-analytics/quasar/wiki/Faking-Sum-T...)with derive4j:
@Data
interface Either<A, B> {
<X> X match(Function<A, X> left, Function<B, X> right);
}
or, at your preference: @Data
interface Either<A, B> {
interface Cases<X> {
X left(A leftValue);
X right(B rightValue);
}
<X> X match(Cases<X> cases);
}
So it is actually not much boilerplate.The real drawback of (any) java implementation is lack of TCO.
instanceOf x is? {
Int { System.out.println("It's an Int"); }
String { System.out.println("Hello, " + x); }
_default { System.out.println("This type is not explicitly named here."); }
}
And if you don't want the type, the value of any other method, function, or attribute could be checked by "is?" or whatever syntax token you want there.Further tests could be saved for within those blocks to save complexity.
What's odd though is that in Java this particular example seems to want multi-method. Testing the type explicitly and acting on it is more akin to duck-typed language programming. You see this pattern pretty often in Perl for example. If I want to write in Perl, I typically reach for Perl.
They're talking about making the Pattern matcher match more complex patterns.
Paper: http://www.cs.cornell.edu/~chinawat/papers/pldi13-p343-israd... Slides: http://www.cs.cornell.edu/~chinawat/papers/chin-pldi13-slide...
Is this addressed with pattern matching?
As you've found, overloading is statically typed multiple dispatch; discovering the right overloaded method at runtime is functionally equivalent to multiple dispatch if your overload resolution rules prefer more specific derived types (rather than simply giving up with ambiguity).
What I want to do is create proper sum types. For some reason most OO languages were designed around the idea that being "open for extension" was a good thing. Normally if I make a base class, I want a closed and known set of sub classes. E.g. if I make a payment method then I want to make a Credit or Invoice kind where one requires Credit Card details and the other requires an address. In F# that's a simple sum type. In C# making a sum type is a convoluted mess of making an outer base class and private nested sub classes.
Here a sum type with the two cases plus an exhaustive pattern matching switch would have solved in 10 lines what takes me 300 lines to do in C#. I think the concept of "fixed" or "closed" hierarchies is foreign enough to OO that it probably needs to be a completely separate concept from the normal types. F# shows that it can be implemented as sugar on top of the regular CLR type system. All that's needed is that some keyword such as "enum class" is converted into a sealed hierarchy of plain record classes with no methods on them. The switch statement we already have can then treat these "enum classes" specially by checking that all cases are matched.
A sketch solution for C# could look like
enum class PaymentMethod
{
CreditCard(CrediCardDetails cc),
Invoice(Address billingAddress),
}
// later
switch (paymentMethod)
{
case CreditCard(ccdetails):
Console.WriteLine($"Creditcard expiring {ccdetails.ExpiryDate}");
case Invoice(addr):
Console.WriteLine($"Bill to {addr.Name}");
}
Writing the above in current C# would require significantly more boilerplate.
This isn't the best pattern to use in all situations: but in many cases when the types are merely data carriers it doesn't make sense to put the logic in the class as OO prescribes, so the FP recipe works much better.It's equivalent of
_ -> default_function
At the end of a pattern matching statement.Essentially it's impossible for not all the cases to be matched, as far as I can see.
For example, if I am writing a logic system in rust I can either match with one single arm or a nested match implement simple rules like double negation. I don't see how you could easily do this with multiple dispatch like the one you link.
I find myself really wish more languages would implement both the system you link and something like rust's/ML's. Usually it's easier to build multi-dispatch than match syntax though. In rust you can use HashMaps of TypeId -> Box<Any>. I would do a similar thing in C# (though now I am going to use this shiny new feature).
> "I don't want to implement the visitor pattern"
Both multiple dispatch and visitor bring type unsafe dynamism and possible runtime errors. Pattern matching and algebraic types fit well into static type system and are much more safe.
But also good to know, that it can be costly and unsafe. `dynamic` is basically using reflection at run-time, which for most cases is fine, but it's something to keep in mind. It also means that type checking is delayed until run-time instead of compile-time.
I think the safety thing is a non issue as well - what arguments could you pass in to the statically typed function that would cause the use of dynamic to bite you in the ass?
Basically, this is going to be about as fast as you can get with the statically-typed underlying runtime.
What I want is to make a new type added mean the code won't compile until it is explicitly handled in all pattern matches that do not have a default clause. Basically what I'm saying is that this:
bool HandlePayment(PaymentMethod pm) {
switch(pm) {
case Invoice(addr):
SendInvoice(addr);
return true;
case CreditCard(ccdetails)
return PrcessCreditcard(ccdetails);
default:
// What?
}
}
Would be a lot better if it didn't require the default, and the compiler could detect the error that occurs when someone adds a new case (BitCoin) to the types of payment method. Inheritance and abstract baseclass for the processing means that you'd do this bool HandlePayment(PaymentMethod pm)
{
return pm.Process(order); // sends an invoice, charges credit card etc
}
and you simply have
abstract class PaymentMethod { abstract bool Process(order); }
class CreditCard : PaymentMethod { ... }
class Invoice : PaymentMethod { ... }
And this is of course the regular way of writing OO, but I find it unnatural and cumbersome in many cases. It's hard to define concise descriptions of what types are actually available, and you have to write a ton of boilerplate to ensure a closed hierarchy and complete matching in all scenarios where you can't enclose the logic inside each type.http://pzol.github.io/getting_rusty/posts/20140417_destructu...
It's not just about finding the right bit of code to run; it's also about pulling some fields out of the polymorphic value so they're available as nice short names, without qualification.
C# is a weird beast though - it's pure OO on some level, but then keeps making getters and setters even easier to write. It's easier to write an "object" where every field has a getter and setter, than it is to write an actual constructor for private variables, so that's what people do.
Not having first class multiline literals really makes it painful to write something like React or GraphQL in Java.
One of the things I give props to MS for is having the courage to evolve C# and TypeScript at a breakneck pace. Each release of TS, they add features I've been wishing for. They seem to be listening to developer productivity issues and responding.
I under Java's desire to preserve backwards compatibility, but they end up breaking it anyway, and make design decisions based on how they can shoehorn things into the older JVMs, but each new release ends up with some issues. Witness how long it has taken folks to migrate to Java8. If we're going to wait years for major releases, then let's make them well worth the trouble when they finally arrive.
Certainly. And that's one of my biggest pain points with Java, but a different point in your list. One day they will exist ... one day. Hopefully.
> Witness how long it has taken folks to migrate to Java8. If we're going to wait years for major releases, then let's make them well worth the trouble when they finally arrive.
I've used every release since JDK 6 in production one or two month after it was released. Once that hurt me (Rhino/Nashorn), but usually it hasn't been a problem. I think people who wait until the last moment when the older versions go out of support to switch JDKs do themselves a disservice out of fear. And fear is never a good reason.
The example they give for notification = email | sms | voice looks a lot like what I was trying to sketch with the paymentMethod example.
http://docs.scala-lang.org/tutorials/tour/case-classes.html
I didn't intend for my proposal to be more limited than than the Scala case classes at least :). I want exactly that. An FP closed type hierarchy with enforced pattern matching.
They also have more configurability - while a case class provides default implementations for things like equality, you can override those (this could be a pro or a con depending on whether you prefer flexibiltiy or performance, I guess - although arguably the flexibility should only have cost if you use it, assuming it's implemented well).
It's definitely not a huge difference, and arguably Scala can suffer from it's philosophy of providing tools that can do something rather than tools for something that means the language often comes across as huge and as having too much stuff in it. I like it, but it's not the C# way right now for sure.
Can you expand on this? I would not have expected that it would be faster than compiled expressions.
[1]: http://geekswithblogs.net/simonc/archive/2012/07/20/inside-t...
When does this misconception disappear?
You don't need to be good at math to use functional programming. It's nice that there is a correspondence between math and FP, but it's mostly irrelevant when coding. You could as well say you need a FP background when learning math. Both statements are nonsense and usually spread by people who mostly read complicated blogposts instead of writing actual code using FP.
FP just offers a nice bunch of intuitive and predictable ways of processing information.
I don't agree with this. Many programming concepts outside functional programming can be very hard to grasp, yet we expect students to pick them up, and we expect front-end Web devs to use (some of) them every day.
Some examples off the top of my head:
- Pointers (e.g. the many incompatible meanings of "" in C).
- Classes and instances.
- Implicit state.
- Mutable global state!
- Concurrency (events, continuations, actors, etc.).
- Multithreading. Yes, it's a form of concurrency, but it also offers a unique combination of being a) the default go-to strategy for implementing concurrency, and b) so fundamentally hostile to any attempts at understanding.
- `x = x + 1`. This can be completely baffling when first encountered!
In this context, functional programming doesn't really require anything particularly difficult. Maybe recursion is hard to grasp, but that's not unique to FP; FP just uses it more often, making it harder to avoid. The only thing I can think of that's pretty much a requirement in FP, that we don't already require everyone to grasp, is first-class functions; and they're already pretty widespread in scripting languages.
What about monads? What about denotational semantics? What about cartesian closed categories? What about all that other complicated math stuff? None of it is needed to do FP*! Just grab a small Scheme interpreter, ignore "call/cc" or anything ending in "!", and start coding.
Let's look at, say, the accepted papers from the last OOPSLA (Object-Oriented Programming, Systems, Languages & Applications): http://2016.splashcon.org/track/splash-2016-oopsla#event-ove...
Looking through the titles, I don't know what an "enclave" is, or "conditional future synthesis", or "first-class effect reflection". I have no idea what a "non-linearizable concurrent object" is, or what constitutes a "dependent object type". Does that mean I'm not good enough at math to learn OOP? Not at all; I can just run `python` and start hacking.
I suppose it's a matter of opinion, but to me the comparison of these two snippets is nowhere near "a wash". The readability of the former is leaps and bounds ahead of the java version, and it will only be more apparent as the example grows in complexity.
And let me be clear. My gripe is only with the example. Not the idea. I happen to like pattern matching, but have never used it like those. Usually I use it to tease apart data by parts. (though, to be fair, I have found I use it less than I thought I did. Not sure why...)
val myOptionalVar: Option[String] = functionThatReturnsOption()
myOptionalVar match {
case Some(str) if str.length > 10 => System.out.println("This is a long string.")
case Some(str) => System.out.println("This is a short string.")
case None => System.out.println("This code would be fifteen lines of null checking in Java.")
} final Optional<String> myOptionalVar = functionThatReturnsOptional();
final String output = myOptionalVar
.map(str -> str.length() > 10
? "This is a long string."
: "This is a short string.")
.orElse("This code would be fifteen lines of null checking in Java.");
System.out.println(output);It's way less code and it makes the resulting code easier to reason about. The code's intent is more elegantly conveyed.
:)
I think that Alan Key spoke about dynamic nature of OOP, that's one of the examples. Even in the statically typed environment OOP demonstrates its dynamic nature.
class Expression {}
class Add : Expression {}
class Const : Expression {}
class Var : Expression {}
void FSpecialization(Expression e1, Expression e2) { ... }
void FSpecialization(Const c, Var v) { ... }
...
void F(Expression e1, Expression e2)
{
FSpecialization(e1 as dynamic, e2 as dynamic);
}
How is this not exhausted? Any new sub class of expression will be routed through to the first FSpecialization. Same as it would with a _ -> default_f
in algebraic pattern matching.Because what happens if you introduce a new expression, but not realize this means the implementation of F should be updated?
Regarding algebraic pattern matching, you can let Haskell check exhaustiveness and then drop the default case. The compiler will then warn you if you forgot one.
To give you an example, in our codebase we have an enum that is used 3068 times in one solution. A very similar one (you would need domain knowledge to understand the difference) is used 2985 times. Both are not defined in that solution by the way.
It's not out of the question they will receive a new enum member down the line. It would be very useful if the compiler was able to tell you a switch that uses either enum is no longer exhaustive.
If you could exhaustively switch on an enum, you could skip the default: case and have the compiler enforce you covered all cases. (I know, the C# enum==int would make that impossible.)
If you would remove FSpecialization(Expression e1, Expression e2) a.k.a wildcard pattern, compiler would not warn you about your dispatching is not exhaustive. And if you'd add another type of expression, all your dispatchers and visitors without wildcard would become non exhaustive, staying valid code from a compiler perspective at the same time.
It helps almost completely avoid the /get/this /get/that /set/those explosion of API getters and setters that ultimately leads to very complex client logic that makes any notion of consistency at a distance very hard to reason about.
A dynamic "type" is really just `object` whose methods are looked up at run time. It's not tagged data at all. See [1] for more info, specifically the example(s) at the bottom, as the reflection code is what gets executed at run (albeit with caching so that methods aren't looked up all the time).
`dynamic` is a complex feature that enables multiple dispatch as a side-effect, but only because it allows a whole lot more.
[1]: https://visualstudiomagazine.com/Articles/2011/02/01/Underst...
> what arguments could you pass in to the statically typed function that would cause the use of dynamic to bite you in the ass
public void Output(int value);
public void Output(Person person);
...
dynamic aValue = "Some String";
Output(aValue);
Compiles, because the actual resolving of which overload to call is done at runtime, not at compile time. And at runtime, there is no overload that accepts a string.Dynamically typed languages usually represent objects as something like a struct with an int tag to represent the type, and a void pointer for its value. The actual reflection going on in the code I posted for multiple dispatch would surely be nothing more than an int comparison - same as what Ocaml probably does.
Ah, I misunderstood the question. Yes, in that case it's fairly type-safe.
> The actual reflection going on in the code I posted for multiple dispatch would surely be nothing more than an int comparison
Is this based on gut reaction or are you talking about optimizations that `dynamic` performs? I ask because my understanding is that `dynamic` is like a really efficient reflection-emitter, that attempts to do at runtime the same thing that the compiler would do at compile time, but slower because it has to look stuff up via reflection.
From Eric Lippert [2]:
> The magic is: the compiler emits code that starts the C# compiler again at runtime. The runtime version of the compiler analyzes the call as though the compile-time types of all the objects had been their actual runtime types, generates an expression tree representing that call, compiles the expression tree, caches the delegate for next time, and runs the delegate.
[2]: http://stackoverflow.com/questions/10330805/c-sharp-multiple...
So maybe `dynamic` at runtime figures out that it can do a "a struct with an int tag to represent the type, and a void pointer for its value", but I wouldn't assume it does and from what I've read on it, I wouldn't think that it does.
None if the function is written correctly, but by that logic you don't need type safety at all. The point is that the dynamic technique gives you no warning if you forget to implement one of the cases.
void React(Animal me, Animal other)
{
ReactSpecialization(me as dynamic, other as dynamic);
}
Unless you pass in a casted value here, it's fool proof.But to answer your question - yes, entirely based on gut reaction and what a compiler would ideally be able to do given the code posted. So very probably wrong.
As an OO apologist I do feel compelled to point out that
type Animal = Cat | Dog | Mouse
let react = function (a, b)
| Cat, Dog -> ...
...
| x, y -> $"{x} is not interested in {y}."
Would suffer from the exact same issue though.