Classes Considered Harmful [pdf](web.cecs.pdx.edu) |
The claim that classes don't do object creation because some operator like "new" actually does that is jaw-droppingly stupid.
In "The Early History of Smalltalk", Alan Kay talks about how he was never entirely satisfied with inheritance, or with having inheritance in the language (quoting in full, since it is hard to link to):
A word about inheritance. Simula-I had neither classes as objects nor inheritance. Simula-67 added the latter as a generalization to the ALGOL-60 <block> structure. This was a great idea. But it did have some drawbacks: minor ones like name clashes in multiple threaded lists (no one uses threaded lists anymore), and major ones like rigidity in the extended type structures, need to qualify types, only a single path of inheritance, and difficulty in adapting to an interactive development system with incremental compiling and other needs for instant changes. Then there were a host of problems that were really outside the scope of Simula's goals: having to do with various kinds of modeling and inferencing that were of interest in the world of artificial intelligence. For example, not all useful questions could be answered by following a static chain. Some of them required a kind of "inheritance" or "inferencing" through dynamically bound "parts" (i.e. instance variables). Multiple inheritance also looked important but the corresponding possible clashes between methods of the same name in different superclasses looked difficult to handle, and so forth.
On the other hand, since things can be done with a dynamic language that are difficult with a statically compiled one, I just decided to leave inheritance out as a feature in Smalltalk-72, knowing that we could simulate it back using Smalltalk's LISPlike flexibility. The biggest contributor to these AI ideas was Larry Tesler who used what is now called "slot inheritance" extensively in his various versions of early desktop publishing systems. Nowadays, this would be called a "delegation-style" inheritance scheme [Liberman 84]. Danny Bobrow and Terry Winograd during this period were designing a "frame-based" AI language called KRL which was "object-oriented" and I believe was influenced by early Smalltalk. It had a kind of multiple inheritance—called perspectives—which permitted an object to play multiple roles in a very clean way. Many of these ideas a few years later went into PIE, an interesting extension of Smalltalk to networks and higher level descriptions by Ira Goldstein and Bobrow [Goldstein & Bobrow 1980].
By the time Smalltalk-76 came along, Dan Ingalls had come up with a scheme that was Simula-like in its semantics but could be incrementally changed on the fly to be in accord with our goals of close interaction. I was not completely thrilled with it because it seemed that we needed a better theory about inheritance entirely (and still do). For example, inheritance and instancing (which is a kind of inheritance) muddles both pragmatics (such as factoring code to save space) and semantics (used for way too many tasks such as: specialization, generalization, speciation, etc.) Alan Borning employed a multiple inheritance scheme in Thinglab [Borning 1977] which was implemented in Smalltalk-76. But no comprehensive and clean multiple inheritance scheme appeared that was compelling enough to surmount Dan's original Simula-like design.
It squarely looks like that to me. I'm not fooled by the transparent ruse of dropping a few names like Smalltalk to appear informed.
Talentless developers who fight and argue are the only real hazard in the world of programming.
Mature trends and languages have attracted decades of developers good and bad, and because they're widely used, they have the misfortune of also providing a home to the largest corpus of terrible code.
Mark my words: in a generation, functional programming will accumulate just as much garbage, inadequately programmed by just as many careless and/or passive-aggressive developers desperately clinging to their jobs throughout the next wave of whatever becomes the next Steve Ballmer stack ranking code review process.
It doesn't matter how many words or ideas you throw at this circumstance. Terrible developers will still manage to commit garbage.
Life finds a way.
Very few come with controlled experiments to find out whether one way of doing things is better than another.
Every time I mention this, people say that it is hard to do such experiments. Yes, it is hard to actually know things, while it is very easy to make plausible arguments for almost anything.
For example, the author of this paper could try to rewrite the Google Chrome browser, a huge C++ project, in the "non-harmful" way.
A new development could use a new paradigm. Same applies to goto vs structured programming, plain procedures vs classes and OOP-style method dispatch, etc.
function makeVector(x, y) {
  return {
    norm: () => Math.sqrt(x*x + y*y),
    // ... more methods ...
  };
}
but frozen so that consumers can't change it except by calling a method.
It's odd how this is the 'obvious' way to generalize lambda expressions from functions to objects, yet it's rare to see this style, and you always have to clarify that you're not talking about prototypes. (IIRC Emerald, mentioned in the paper, also works something like this.)
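A minimal sketch of that style, assuming plain JavaScript. The `x`, `y`, and `plus` methods are illustrative stand-ins for the elided "more methods", and `Object.freeze` is just one way to get the "frozen" part:

```javascript
// State lives only in the closure; the returned record of methods is frozen,
// so consumers can read or derive new vectors but cannot swap out a method.
function makeVector(x, y) {
  return Object.freeze({
    x: () => x,
    y: () => y,
    norm: () => Math.sqrt(x * x + y * y),
    // Hypothetical extra method: builds a new vector instead of mutating.
    plus: (other) => makeVector(x + other.x(), y + other.y()),
  });
}

const v = makeVector(3, 4);
const w = v.plus(makeVector(1, 1));
```

Assigning to `v.norm` throws in strict mode and is silently ignored otherwise. No prototype chain is involved; every object carries its own method closures.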
Are all languages really equally good? Is C really no more productive than assembly? Is Swift really no more productive than Objective-C? For many higher level applications is Java really no more productive than C? Doesn't using something higher level like Python make certain kinds of tasks easier than the same task in Java?
Since the inception of programming languages, there has been continuous improvement of the existing languages as well as creation of new languages, both of which have introduced new paradigms and features to programming. These new features and paradigms often replace older features and paradigms. Not everything works, but much of the time the new features and paradigms have made writing and maintaining good programs much easier.
Unless you believe our current tools are already completely optimal, it seems only natural to try to improve upon them.
So is your argument that programming languages in a generation will be just as productive or unproductive as what we have now? That seems too pessimistic to me. I think our tools will be much better, just as I feel our tools now are better than they were in 2000 or 1990.
If the argument is that functional programming is not an improvement, that's fine, how do you see languages improving? What's the direction they should take?
Languages fill a niche, and outside of their area of utility it's easy to notice weaknesses.
But arguments in favor of functions are similar to those in favor of objects. They both joust for the award of being adept at doing more with less code, in terms of mythical-man-month project scalability.
I've experienced that problem with huge OOP code trees, and I still don't foresee Pure Functional programming supplying a cure for that problem. Maybe I'm just not one of the lucky ones yet. I don't know.
But when I read Pure Functional code, even though it's tighter, with less boilerplate, it doesn't feel like honest improvement. It's like Coke or Pepsi, McDonalds or Burger King.
OOP might yield thousands of source files and hundreds of lines in some files, but if a Pure Functional project does the same thing with fewer lines of code in fewer files, it's still doing the same thing.
So in a decade you'll probably find the same complaints, in the same places, but at least people will have different hair styles.
- NewtonScript's frames with prototype inheritance. Allows "objects" to consist of very small data structures that point to ROM objects and only use RAM for slots that actually vary, using copy-on-write. The actual implementation had some issues and of course that platform is dead, but the concept was great.
- Dylan's adaptation of generic functions, with multiple inheritance. There are classes but they don't have methods "inside" them. The class hierarchy describes the dispatch for the generic functions. Allows some really nice styles of programming (dispatching on multiple parameters) that are perhaps possible but absolutely hideous in C++.
- Qt's method. The use of a separate preprocessor step and generated code for dispatch was a hack to get around limitations of C++, but although you don't want to look too closely at that generated code if you value your sanity, at a practical level it works pretty well.
In summary, I'd just add that it is a deep and abiding shame of my industry that people wind up in silos learning one model of object-oriented programming and never consider other models that are different and can be cleaner and far simpler to think about.
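The NewtonScript idea in the first bullet can be approximated with JavaScript prototypes. This is only a sketch of the concept (shared slots in one place, per-object storage only for slots that differ), not of NewtonScript's actual implementation; `protoButton` and its slots are made-up names:

```javascript
// Shared prototype: stands in for the frame that would live in ROM.
const protoButton = {
  label: "OK",
  width: 80,
  describe() {
    return `${this.label} (${this.width}px)`;
  },
};

// A new object stores nothing of its own; lookups fall through to the prototype.
const cancel = Object.create(protoButton);
const ownBefore = Object.keys(cancel);

// Writing a slot creates an own copy on this object only (copy-on-write in spirit).
cancel.label = "Cancel";
const ownAfter = Object.keys(cancel);
```

After the write, `cancel` owns only `label`; `width` and `describe` are still read from the shared prototype, which is itself unchanged.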
No copy is available in Rust. Overloading a generic function sounds like we are talking about interfaces. A preprocessor step, again, is similar to Rust.
But only the second has anything to do with writing in OOP.
I have found this approach to multiple dispatch leads to code that's much easier to reason about and maintain. My use case is a tensor library where the tensors have a single interface but various storage backend types, which 'interact' with each other. It doesn't make sense to define these interactions through methods belonging to only one class or the other.
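As a rough illustration of that use case, here is a hand-rolled dispatch table in JavaScript keyed on the runtime types of both arguments. The `DenseStorage` and `SparseStorage` classes and the `defAdd`/`add` helpers are hypothetical; a language with real multiple dispatch would provide this natively and would also handle subclasses:

```javascript
// Hypothetical storage backends behind a single tensor interface.
class DenseStorage {
  constructor(values) { this.values = values; }
}
class SparseStorage {
  constructor(entries) { this.entries = entries; } // Map from index to value
}

// Dispatch table keyed on the types of BOTH arguments (exact match only).
const addImpls = new Map();
function defAdd(typeA, typeB, impl) {
  addImpls.set(`${typeA.name}+${typeB.name}`, impl);
}
function add(a, b) {
  const key = `${a.constructor.name}+${b.constructor.name}`;
  const impl = addImpls.get(key);
  if (!impl) throw new TypeError(`no add defined for ${key}`);
  return impl(a, b);
}

// Each interaction is defined on its own, outside either class.
defAdd(DenseStorage, DenseStorage, (a, b) =>
  new DenseStorage(a.values.map((v, i) => v + b.values[i])));
defAdd(DenseStorage, SparseStorage, (a, b) =>
  new DenseStorage(a.values.map((v, i) => v + (b.entries.has(i) ? b.entries.get(i) : 0))));
defAdd(SparseStorage, DenseStorage, (a, b) => add(b, a));

const dense = new DenseStorage([1, 2, 3]);
const sparse = new SparseStorage(new Map([[1, 10]]));
const sum = add(sparse, dense);
```

The sparse-plus-sparse case is deliberately left undefined here, so `add(sparse, sparse)` raises a `TypeError` rather than guessing.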
There are several major reasons for that. One of them is that we actually need two different relations: 1) a membership relation, and 2) inheritance (or inclusion, in the more general case). Unfortunately, we cannot reduce them to one relation. If we do (as in prototype-based languages), then normally we will still distinguish the role of this one relation depending on the context.
Mathematically, we have the membership relation '∈' between an element and a set, and we have the subset relation '⊆'. Just as we need both of them, we need both class instances and classes. In other words, if somebody argues that classes are not needed, then it is analogous to the statement that sets are not needed and it is enough to have only membership relation among elements. It is possible to develop such a theory but then we will get an alternative mathematics. Or we will implicitly treat some elements as sets.
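The two relations can be written down directly. The sketch below uses JavaScript, where both checks end up walking the same prototype chain, which is the "one relation, two roles" situation described for prototype-based languages; `Num`, `Int`, `isMember`, and `isSubclass` are illustrative names:

```javascript
class Num {}
class Int extends Num {}

// Membership: x ∈ cls
const isMember = (x, cls) => x instanceof cls;

// Inclusion: a ⊆ b, i.e. every instance of a is also an instance of b
const isSubclass = (a, b) => a === b || a.prototype instanceof b;

const three = new Int();
```

Here `isMember(three, Num)` and `isSubclass(Int, Num)` both hold, while `isMember(Int, Num)` does not: the class `Int` is included in `Num` but is not itself a number.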
No. You don't need OO at all, let alone bolt-on conceptual baggage like this.
The notion of object oriented programming is completely misunderstood. It's not about objects and classes, it's all about messages. - Alan Kay[0]
[0] From my fortune clone @ https://github.com/globalcitizen/taoup
I'm not arguing for one or the other, but extreme late binding as espoused in much of the Smalltalk way of doing OO only takes you so far. Static reasoning about program structure is helpful, and message passing alone does not give you enough structure to do that.
Often cited, but somewhat meaningless without the context in which Kay originally made that statement.
One should instead consider that object-orientation is "… the insight that everything we can describe can be represented by the recursive composition of a single kind of behavioral building block that hides its combination of state and process inside itself and can be dealt with only through the exchange of messages." - Alan Kay
It is primarily an approach to representing complex problems and systems in code. It is an alternative to modular programming (procedures and subroutines) organized using functional decomposition (structured programming), or an information or data-flow modeling approach. It is not primarily about organizing code.
Just skimmed the 'Cross-Cutting Concerns' in your paper above. Isn't AOP far more general than what you propose there? Please correct any misunderstanding but per the parent based approach you describe -- "cross-cutting concerns are modularized in parent incoming methods and this functionality is injected in child methods" -- you would need to duplicate a CCC e.g. logging in each distinct 'parent'.
Yes, definitely AOP provides much more freedom in injecting behavior to other parts of the code - it is its main goal while COP has different goals.
> you would need to duplicate a CCC e.g. logging in each distinct 'parent'
No, because in COP, "Parents are Shared Parts of Objects" [1], that is, a parent may have many children. Inheritance in COP is inclusion. If a parent implements a method then it will be reused by all its children.
[1] http://bibliography.selflanguage.org/_static/parents-shared-...
Better to separate data types from functions on data types the way it's done in functional languages, and if you still want object-oriented programming, you can support it like Ada does, or as syntactic sugar over the ordinary call syntax like some functional languages do.
Also, it does not mean that the particular C++ implementation of classes is any good.
The answer, of course, was inheritance, and once the question was asked it wasn't hard to see how much better life could be without it. I've been on an anti-inheritance kick ever since.
Now I see how much better life can be without OO entirely, but that's a different discussion.
Generic interfaces are the one single feature that made OO popular for GUI programming, which in turn is the one application that made OO popular. And it's still the most powerful of the OO concepts (which is a shame, really).
When you have classes, you have subclassing, and subclassing means two different things at the same time. First, subclasses are the subtyping operation. If you want type B to be a subtype of A, B must subclass A. Second, they provide code reuse via inheritance.
Each of those things is useful, but there's no reason to tie them together. In fact, modern best practices in OO say to favor interfaces and delegation over subclassing, and that's precisely because it separates those two concerns.
As far as I can tell, this is what the paper was getting at - separating delegation of implementation from subtyping. Since this is already what OO best practices suggest, why is it controversial?
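A small sketch of what that separation looks like in practice, in JavaScript. `Stack` and `drain` are made-up names; the point is only that reuse comes from holding an array (delegation), while "subtyping" is reduced to conforming to the methods a caller expects:

```javascript
// Reuse by delegation: Stack holds an array instead of extending Array,
// so it exposes push/pop/size and nothing else from the Array API.
class Stack {
  #items = [];
  push(item) {
    this.#items.push(item);
    return this;
  }
  pop() {
    return this.#items.pop();
  }
  get size() {
    return this.#items.length;
  }
}

// The caller depends only on an interface: anything with pop() and size.
function drain(stackLike) {
  const out = [];
  while (stackLike.size > 0) out.push(stackLike.pop());
  return out;
}

const s = new Stack().push(1).push(2).push(3);
const drained = drain(s);
```

Had `Stack` subclassed `Array`, it would have inherited `splice`, `map`, index assignment and so on, all of which can break the stack discipline.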
I guess "Classes considered a PITA" just didn't make a good title.
http://www.cs.jyu.fi/maspeghi2013/papers.html
specifically:
NOOL seems to stand for "New Object-Oriented Languages", a workshop of the ACM SIGPLAN conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH).
You can recover the strong form of ML-style FP (albeit without the nice type inference) if you limit inheritance to abstract classes (thus, all classes are sealed/final or abstract), mark all fields as readonly/final, and use interfaces for destructuring (which addresses problems 1,2 and 3 under the “Alternatives” section).
As an aside, one of the big issues with “classical” OOP is that it enjoins the programmer to create phenomena using logic (i.e. wizardry) rather than using math to model a system (i.e. science and engineering). Among other problems, this leads to designs that break symmetry (e.g. if modeling a bucket brigade, the perspective of many humans passing by a single bucket should be just as valid and available as the “natural” perspective of many buckets passing by a single human). Of course, appealing to symmetry is question-begging, and I don’t have time to litigate that here, but it sure seems like symmetry is key to building scalable systems (where scalability means not just number of users, but also things like number of edits to a codebase).
In summary: Classical OOP seems to require that you either have a prior (well-designed and debugged) model against which you can program, or really, really good taste.
A. Stepanov (http://www.stlport.org/resources/StepanovUSA.html)
To that end, even starting with definitions is often harmful when the definitions themselves are so often tailored to make a particular proof work, and make no sense without such context.
In practice these days often the most effective way when developing OO code is working the "proofs" till you have a set of classes.
Also, some people go for very FP inspired OO code.
"so before we have written a single program in our language, before we know whether shared behaviour will be important in the applications that will be written in it"
read the POODR book, or watch this
https://youtu.be/OMPfEXIlTVE?list=PL5s3t9kPeAN6aDxaSywIbeFJO...
it explains how shared behavior works without code duplication and why people tend to mess it up.
33 number?
t
33 integer?
t
integer number class<=
t
number object class<=
t
The last two lines demonstrate a little of the class algebra possible. They say that if an object is an instance of the integer class, then it is also an instance of the number class and of the object class. That is, the set of objects that are integers is a subset of the set of objects that are numbers, which is a subset of all objects.
I can add new classes which refine categories of existing objects:
PREDICATE: positive < number 0 > ;
PREDICATE: prime-number < integer prime? ;
17 positive?
t
17 prime-number?
t
13.4 positive?
t
Note that both real and integer values can be "positive", which means that "positive" isn't a strict subset of either class. That kind of refinement isn't possible in most OO languages.
I find that classes, if nothing else, are a fantastic way to namespace data and methods that operate on said data. For instance, if I have a chair object, I could use a class to describe the chair. I could store the materials it's made out of, the height of the seat, whether it is an office chair, a barstool, or something else. I could also write methods that can, say, price the chair based on data about the chair.
In this particular case, I wouldn't subclass the chair at all. There's no point. Every time I need a chair, I would just create a new instance of Chair. That's why chair is an object. It's a self contained thing that I can pass around and pull data out of or mutate.
I don't really use deep inheritance at all in my code. I think the deepest I've ever subclassed something is one level. I find inheritance to be useful if I need to be able to create multiple objects that deal with very different things, but perform similar operations on them. A good example of this is Django class-based views (its own controversial topic). I created my own resource class to use for handling API-specific requests. The methods of pulling filters out of the query string and serializing database objects are exactly the same for every single request, but the objects that these classes operate on are vastly different. Therefore, I need to specify a serialization schema for each database object type, but I don't necessarily need to rewrite the function that does the serialization. I use my Resource class in order to more easily re-use these generic methods.
Just my two cents.
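The shape of that Resource class might look roughly like the following. This is a JavaScript sketch rather than Django code, and every name in it (`Resource`, `fields`, `parseFilters`, `serialize`, the two subclasses) is hypothetical:

```javascript
// Base class: the generic request handling is written once.
class Resource {
  // Subclasses override this with their own serialization schema.
  static fields = [];

  // Pull filters out of a query string such as "material=oak&kind=barstool".
  parseFilters(queryString) {
    return Object.fromEntries(new URLSearchParams(queryString));
  }

  // Serialize a record using whatever schema the concrete subclass declares.
  serialize(record) {
    const out = {};
    for (const field of this.constructor.fields) out[field] = record[field];
    return out;
  }
}

// One level of subclassing: only the schema differs.
class ChairResource extends Resource {
  static fields = ["id", "material", "seatHeight"];
}
class OrderResource extends Resource {
  static fields = ["id", "total"];
}

const chairJson = new ChairResource().serialize(
  { id: 1, material: "oak", seatHeight: 45, internalCost: 12 });
const filters = new ChairResource().parseFilters("material=oak&kind=barstool");
```

The subclasses contain no behavior at all, which is roughly the point being made: the hierarchy is one level deep and exists only to parameterize shared methods.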
* Interfaces
* Records
* Abstract data types
* All the above should use foo.bar() syntax
Classes provide those things, which is good, but also provide implementation inheritance, which is bad.
Inheritance isn't bad.
Abusing inheritance is bad.
For the record, you can also abuse composition, polymorphism, operator overloading, free functions, interfaces, records, and abstract data types. I've abused them all and seen them abused as well.
Inheritance is just another tool. Like all tools, we are expected to use them wisely.
Classes are supposed to model sets (of objects), that is, instances represent real objects and classes represent real sets of objects. So the question is whether sets are as real as the members they consist of. For example, if a set of chairs is a reality that we can comprehend and want to represent in the system.
Yet, I think the problem is that we are probably not happy how classes model sets in OOP. They actually do not - and it is a major problem. (Classes in OOP are mostly templates for instantiating objects.) Also, we are not happy with how inheritance works and it also limits possible benefits of classes. But it is not a reason to say that classes per se are harmful. I would say that they are not as good as they could be :(
E.g., @property (readonly) NSString* myName;
And, given all the use cases I have, I'd be amazed if one class of language can do it all. I'm guessing it's impossible.
It's way better for my engineering team to know how to use a bunch of different specialized tools, and how to use them together, than to focus on one type of tool.
Premature optimization (if that is what classes are) doesn't mean the optimization is harmful. It could actually be really good.
The quote means that when you find some evil and track it back, you'll find premature optimization. You can also find awesome code, track it back, and also find premature optimization.
So, you need to prove classes are evil, and then track it back to premature optimization. This paper fails to do that and therefore fails to make the case for harm.
In my experience, the vast majority of people who program are doing so not as their full-time job, but out of necessity. This includes scientists, analysts, academics, accountants, etc. If I had to guess, these individuals write way more code as a whole than full-time developers. Adherence to the ideals of object-oriented programming is extremely uncommon from what I've seen. I see a lot of copy-and-pasted code in one gigantic class, and a mish-mash of programming styles from wherever they stole their code from. Testing is completely unknown to them.
I think a lot of these considered harmful discussions are carried along by the impact GOTO Considered Harmful had on the programming world, but I don't think they actually address concerns that affect the vast majority of people writing code. They are of obvious interest to Hacker News readers because people here are trying to maximize their productivity and are extremely tech savvy already, but if you are trying to write a language which minimizes the impact that mistakes make on the world I think you should look at where the vast majority of code is being written. The biggest players already have the knowledge to program effectively, it's about creating ways of coding that make good programming more intuitive, and I don't think OO is the problem there.
> Let us also be clear that classes do not “model the real world”. Objects may or may not model the real world, but classes certainly don’t. Although there are many chairs in the real world, there is no “chair class” in the real world.
That there is no precise definition of what a class is and isn't, and that the traditional notion of a class conflates several independent concepts, is precisely the big criticism! To my mind "traditional classes" means inheritance, and recent languages without inheritance (Rust or Go) have tended to not use the term "class".
But ADTs do not support inheritance, so Rust uses Traits instead (comparable to interfaces or typeclasses), that support inheritance of sorts, but without the usual perils of class-based inheritance, like the diamond problem.
That's actually why I love the model of golang and rust. They model the real world more closely.
When you look at a desk, how do you define it?
Well, if it's a wooden desk it's actually of the type Wood, and it probably has 4 table legs and probably a desk size of x*x. So this is way easier to represent with subtyping, since a desk can be made of wood but doesn't need to be. With structural inheritance that would mean a lot of boilerplate if you needed to define all types of desks, while in Rust you would just add the trait impl to any of the base structs, etc.
Also deep inheritance can be nasty to debug, even Scala will probably move to a more flat level in their collections library.
Inform7, in its beautifully literal (not to say literate) way, captures this very simply with the way it defines classes (what it calls 'kinds'):
A chair is a kind of thing. A chair can be comfy or hard.
Your favorite armchair is a comfy chair in the living room.
But to be fair, Inform7 is object-oriented because it's a language for describing worlds (interactive fictional worlds, specifically) that are made of objects, and the utility of the metaphor for building, say, a website, may not be as strong.
Other languages might use other techniques to capture 'chairness' without having to start off by describing the class of 'chair'. Maybe you define the protocols chairs support (sittability?), or you just rely on dynamic typing and write code that just assumes whatever is passed to it is a chair and can be sat upon, without the code ever needing to be able to divine an object's chairness. That's useful! You can sit on things that aren't chairs, after all. If you can only build objects from classes, though, you can get yourself into the situation where you find yourself having to find a way to compose a 'chair' instance into your 'bed' class so you can allow someone to sit on the bed without having to copy paste code.
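The "sittability" protocol can be sketched in a dynamically typed language like JavaScript as a plain capability check. `Chair`, `Bed`, `canSit`, and `sitOn` are illustrative names only:

```javascript
class Chair {
  sit() { return "sitting on a chair"; }
}

// A bed is not a chair and contains no chair, yet it can be sat upon.
class Bed {
  sit() { return "sitting on the edge of the bed"; }
  sleep() { return "sleeping"; }
}

// The protocol: anything with a sit() method counts.
const canSit = (thing) => thing != null && typeof thing.sit === "function";

function sitOn(thing) {
  if (!canSit(thing)) throw new TypeError("cannot sit on that");
  return thing.sit();
}

const results = [new Chair(), new Bed()].map(sitOn);
```

`sitOn` never asks whether its argument is a `Chair`, so `Bed` needs neither a shared superclass nor an embedded chair instance.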
I'm not sure why this is the case. Just the other day someone posted a link to a video about Abstract Algebra. In the video they defined "groups" that (to me) are synonymous with Classes in OOP.
However, composition and parametrization require up-front design and are usually only viable for large-ish variations.
As the parent said, inheritance handles the case of refinement, that is, programming by difference, which composition and parametrization handle with difficulty (if planned for) or not at all (if not planned for).
The bad rap inheritance gets is that it is often misused in places where composition or parametrization were appropriate. But that's just the old "it hurts when I poke myself in the eye with a sharp stick": don't do that.
A good, thoughtful design is better, of course.
As the other reply stated, when you mostly want to keep the base class but over-ride a small specific subset of what it does.
E.G.
The base class uses class member functions to do save/load records from a store.
You want to implement a variant of that class which instead works well with a database engine you like.
You can then derive a class and define JUST the storage and recall functions, and inherit everything else.
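A sketch of that scenario in JavaScript, with made-up names throughout (`AddressBook`, `FakeDb`, `DbAddressBook`); `FakeDb` stands in for whatever database client you actually like:

```javascript
// Base class: business logic plus two storage hooks backed by an in-memory Map.
class AddressBook {
  constructor() { this.store = new Map(); }

  // Storage hooks that a variant may override.
  saveRecord(key, record) { this.store.set(key, record); }
  loadRecord(key) { return this.store.get(key); }

  // Everything else is written in terms of the hooks and is inherited as-is.
  rename(key, newName) {
    const record = this.loadRecord(key);
    this.saveRecord(key, { ...record, name: newName });
  }
}

// Hypothetical stand-in for a database client; it logs each call.
class FakeDb {
  constructor() { this.rows = {}; this.log = []; }
  put(key, json) { this.log.push(`put ${key}`); this.rows[key] = json; }
  get(key) { this.log.push(`get ${key}`); return this.rows[key]; }
}

// The variant defines JUST the storage and recall functions.
class DbAddressBook extends AddressBook {
  constructor(db) { super(); this.db = db; }
  saveRecord(key, record) { this.db.put(key, JSON.stringify(record)); }
  loadRecord(key) {
    const json = this.db.get(key);
    return json === undefined ? undefined : JSON.parse(json);
  }
}

const db = new FakeDb();
const book = new DbAddressBook(db);
book.saveRecord("a1", { name: "Ada" });
book.rename("a1", "Ada L.");
```

`rename` is never redefined, yet it now reads from and writes to the database because it calls the overridden hooks. This is the programming-by-difference case; the cost is that `DbAddressBook` depends on how the base class happens to use those hooks.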
Go interfaces have the part about keeping the code around it generic, but I don't think it gets the easy reuse of code by just writing the hooks. (But I can easily be wrong here - I'm not a proficient Go programmer.) In that, it looks very like Java's interfaces (the literal Interface thing).
1 - In Java you would not use a literal Interface for that, since you can't provide default code for the subclasses to extend.
Class hierarchies are inherently unfriendly to strong, sound type systems. Interface composition isn't.
Composing interfaces is clean and straightforward. Composing classes isn't, and you run into things like the diamond problem.
The statement that it is better to separate data and behavior is far from self-evident IMO, and I would be very interested to understand why this is such a deeply held belief. Seriously, why?
Data is and always has been more valuable and more important than behaviour. It's not unheard of to preserve accounting data that's decades old, even though the accounting system itself changed completely many times. The converse, preserving old programs to run on brand new data is much more rare.
The whole hoopla around big data should make it clear where the real value is. Behaviour is merely a way of transforming data, and coupling the two rarely works well in the long run. The only times it's actually advantageous is to preserve invariants that ensure input or output data is well formed. For instance, data structures that ensure or preserve orderings, etc.
Personally speaking, it's just a lot easier. I could go on about various theoretical justifications, but at the end of the day it's just easier. This probably has something to do with the fact that you don't really gain anything by defining data and behavior together, but you do lose flexibility by coupling those possibly orthogonal concerns.
Or let the compiler do it for you? It's like saying I shouldn't use first-class closures, because they're also trivial to implement manually where I actually need them.
Structs are almost exactly like the C-objects of the same name: they define layout in memory for a data structure with discrete types as fields, and there are various packing options available. However in addition to this a struct-specific implementation can exist which defines static and instance based methods. As there is no inheritance in Rust, a struct's "impl" is unique to itself.
Polymorphism is implemented via the traits system. A trait defines a set of methods that objects meant to have said trait must implement (unless a default implementation for a given method in the trait exists). Any struct can have an implementation for a given trait, so these can be considered a (rough) analog to abstract base classes/interfaces (e.g. C++ classes that are either pure virtual or have virtual functions with default implementations).
In practice, one defines an interface in C++ by having a (hopefully) stateless class with pure virtual method declarations, and then classes derived from this class must implement these methods (in order to instantiate them anyway). In Rust one defines a data structure (either a struct or enum) and then, separately, writes the "impl" for it for a trait.
The use of closures in functional programming is an implementation detail, not a fundamental aspect of the underlying theory, and closures only "couple" data and behavior in the very literal sense that it is a function that happens to point to some data under the hood, not in the syntactic or semantic sense we're referring to when we say that OOP classes couple behavior and data.
A closure in a pure language can't couple to data any more than a lambda expression without free variables (from the programmer's perspective), because they can have the exact same type and semantics.
In other words, there's no semantic difference between "f = \x . x + 2" and "f = let y = 2 in \x . x + y" even though the latter is a closure in the absence of inlining.
Ah, well there's the confusion. A language with semantically impure closures emphatically does not reproduce the semantics of the lambda calculus unless you restrict yourself to only pure, immutable variables. But if you reproduce the lambda calculus, there's no need for closures as we know them. It's just an implementation that happens to work well. You could supercompile the lambda-abstraction at the application site and avoid having a closure. It's just not a good idea.
But I agree, in a language with mutable variable capture, you can reproduce the semantics of OOP by providing a bundle of functions closed over mutable references to the "object" data.
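For concreteness, a minimal JavaScript sketch of that construction: a bundle of functions closed over one mutable variable. `makeCounter` and its methods are illustrative names:

```javascript
function makeCounter(start) {
  // The mutable "instance variable"; reachable only through the closures below.
  let count = start;
  return {
    increment: () => {
      count += 1;
      return count;
    },
    reset: () => {
      count = start;
    },
    value: () => count,
  };
}

// Two "instances" with independent state.
const a = makeCounter(0);
const b = makeCounter(10);
a.increment();
a.increment();
b.increment();
```

`a.value()` and `b.value()` now differ, and neither object has a `count` property: the state is hidden by scoping rather than by an access modifier.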
> a language that bills itself as functional that doesn't provide that capability natively
I'm not sure exactly what capability you're referring to. Could you clarify a little bit?
What you describe for C++ is typically the case (and I've written completely unsafe, non-portable code that exploits this fact before), but I'm unsure whether that's required by the standard.
But, given the case where you want dynamic dispatch, then in a sense, they are, yes.
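(* foo.ml *)
(* Record type standing in for the "object": a single bound function. *)
type foo = { getFoo : int -> int }
let bar =
  let foo x y =
    if y = 0 then { getFoo = (fun _ -> 0) }
    else { getFoo = (fun z -> (x * z - x + z) / y) }
  in
  foo 3  (* x = 3 is captured here and is not visible to clients *)
(* bar.ml *)
open Foo
let y = bar 5  (* captures 5 as well *)
let result = y.getFoo 4  (* (3*4 - 3 + 4) / 5 = 2 *)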
It sure seems to me that bar is hiding the 3 from bar.ml (and there are much more complicated examples where foo.ml really can't get at the captured type, for instance), and at the same time it's capturing the value of 5. The function getFoo on bar is bound to it (the record type can desugar to more curried lambdas, if necessary, it's just tedious and this illustrates my point better).
None of this seems to have anything to do with mutation, to me, though it certainly seems to have both information hiding and binding logic and data. The fact that it can be compiled not to use closures (using whole-program compilation) is immaterial; the same is true for the same program specified with objects in mutable languages (SML has both mutation and closures, and MLton does exactly that).
It really just depends on what exactly you're trying to do.