Imperative vs. Declarative(latentflip.com) |
Imperative vs. Declarative(latentflip.com) |
SQL and Prolog are both examples of declarative programming, so is Make to an extent. Using a map function doesn't make JavaScript a declarative programming language - it's a functional programming concept, not a declarative one.
That doesn't keep it from being a declarative thing.
In the case of map/reduce, if these functions are implemented in your language (say in jquery) then its not declarative as your implementation is still being specified. If you are actually talking to your environment in terms of map/reduce, then those methods are declarative. On the other hand, if we consider jquery as a part of our environment, perhaps it does make sense to consider it declarataive?
For example, consider the problem of finding the second largest value in a set.
In SQL, I'd do something like:
SELECT MAX( col )
FROM table
WHERE col < ( SELECT MAX( col )
FROM table )
It's pretty readable, and can almost be read in plain english: "Get the next biggest value from the table where it's smaller than the biggest value."How might you do this in Java? http://stackoverflow.com/questions/2615712/finding-the-secon...
But look at all the other ways you can do it in that thread. None of them are very readable. And, they can hide subtle bugs that you won't find just by reviewing the code.
Ruby has a pretty concise example if you happen to know the trick, and that the sort performance isn't miserable (kind of a gotcha question): http://stackoverflow.com/questions/8544429/find-second-large...
This is a very simple example, but as you scale up to more complex problems I almost always find SQL is fewer lines of code, more readable, and far less buggy (SQL tends to either work or not work - I find much more subtle bugs in a 30 line Java algo than a 10 line SQL command).
In Prolog you ask questions. For example: subset([1,2],[2]).
then it goes away and says "yes". Or you want to know if any subsets exist: subset([1,2],B).
B = [] B = [1] B = [2]
This makes it really really nice for some surprising tasks (Windows NT used to ship with a prolog interpreter for setting up the network)
The language has a lot of functionality that Prolog doesn't have and (thanks to a strong typing system) performs much better. It just needs a bigger community to support it.
I worked through the 7 languages in 7 weeks book, and solving a sudoku with prolog blew my mind. I think the first "real" programming I did was a sudoku solver in Excel and VBScript (yeuch).
Its fairly trivial to write a checkers AI in prolog if you can define what you want + need:
double([], []).
double(H | T, H2 | T2) :- H2 is H * 2, double(T, T2). double numbers = map (\x -> x * 2) numbers
or double [] = []
double (x:xs) = x * 2 : double xs
or double = map (*2)
I don't think #2 is more declarative than #1, let alone #3. Then again, I don't think any of these versions is very declarative.Declarative would be numpy:
doubled = numbers * 2The fact that, with unification and backtracking, you can not only get a result for a query, but also "pass a variable" as an argument and get a possible value makes it seem much more like a mathematical expression and less like a computation.
For example, I can define some relations:
parent_of(john, mary).
parent_of(mary, eve).
grandparent_of(X, Y) :- parent_child(X, Z), parent_child(Z, Y).
And then I can simply run a query: ?- grandparent_of(john, eve).
Yes
But I can also make it fill in the value for me: ?- grandparent_of(john, X).
X = eve
'grandparent_of' is not some piece of code, it's an actual declaration of a relation between the terms.Of course, you can do unification and backtracking in other languages, but Prolog is designed for it.
In imperative style, most of your mistakes or carelessness will usually mean that the machine makes a wrong result or crashes in the process - a bad 'what'.
In declarative style, most of your mistakes or carelessness will usually mean that the machine will take a bazillion times less efficient way trying to make that result, possibly taking 'forever' or running out of memory - i.e. a bad 'how'.
That said, the "why did it choose that terrible implementation?" problem does occasionally come up in declarative programming, and inherently never comes up in imperative.
Step #2 is doing whatever is necessary to make that code work. Sometimes this means using the more interesting stuff like reflection, dynamic objects, expression tree visitors, etc. but I find that subsequent issues keep getting easier to deal with. This is because step #1 is naturally building a DSL for your problem domain and you start to find that what you did in step #2 is quite reusable.
I've been programming for a while, so I have experience with the imperative, "write the code that solves the problem" approach and it works too, but I am having fun with the "write the code that describes the problem" approach more.
Just my two cents.
As an example, I turned what otherwise would have been an extremely tedious exercise in writing tons of obscure SQL (creating reports from a very non-standard database layout) into an API for creating reports that is literally like reading the specification for the report. And all of it was done in about 250 lines of C#. And to top it off we still have complete static type checking! I really cannot sing the praises of C# enough.
For some reason, it seems we programmers are adamant that it must be one or the other. Consider all of the examples, either purely imperative or purely declarative. Why not both?
Taking the example from the original article, it would be more akin to:
// Implementation
function double(n) {
return n * 2;
}
// Declaration
[1,2,3,4,5].map(double)
=> [2,4,6,8,10] elements = [1,2,3,4];
doubledElements = doubleElements(elements)
Basically, if you see the words map, fold, reduce in your code, you are probably not as easy to understand as you'd like to think.Of course, in the cooking methaphor, I'm ok with mutating elements and just doing:
elements = [1,2,3,4];
doubleElements(elements);
This clearly has issues if multiple "cooks" are working with elements. But is ridiculously easy to intuit regardless. (Precisely because in real life so many things are changed by imperative commands.)>But we also get to think and operate at a higher level, up in the clouds of what we want to happen, and not down in the dirty of how it should happen.
The author mentions this at the end, but I feel it should be stressed more strongly: The dirty of how is important. The author presents a big "if" here, which is: if the function we've built to abstract away some ugliness performs in the best, most efficient way possible, with no drawbacks, then, yes, abstracting that functionality away and forgetting it is okay.
But to me that's a big if. It is just as important to me to understand and recognize that map is fast, efficient, and to understand why it's fast and efficient, so that someday, if you come across a situation where map does not apply, you will know why, and you'll be able to use something better.
Being up in the clouds all the time is, to me, a pipe dream -- we must always be cognisant of the ground on which we built this tower of abstraction layers.
var genericFilter = function(type, value) {
return function(items) {
return _.filter(items, function(i) {
return i[type] === value;
});
}
};
var sizeFilter = genericFilter('size', selectedSize);
var brandFilter = genericFilter('brand', selectedBrand);
var appliedFilters = _.compose(sizeFilter, brandFilter);
var filteredItems = appliedFilters(items);
// which ends up doing sizeFilter(brandFilter(items));
// edit for sloppy code;Their apprehension of tackling code is one I don't immediately understand, but I do get that they don't want to think about the how, rather the what. It's a funny parallel.
Here's a great video by Bret Victor who saw this problem, and tried to fix it for animation:
PS: What happened to 3GL vs. 4GL?
max(col for col in table if col < max(table))
a fun one is this: first, second = max(permutations(table), 2)
although its semantics are slightly different (if the maximum of the table is duplicated, it'll be returned for both slots)or using heapq which notaddicted mentioned.
In your example, if you replace max() with a function that prints something, you'll see it's executed for each value in 'table', which is extremely inefficient. This happens because Python can't guarantee that max() will return the same each time.
Similarly, while in a side-effect free context the runtime could, for example, slice 'table', perform the work using multiple concurrent threads and then join the result, Python has to guarantee that the execution is done sequentially, since a change in order could affect the result.
Unlike in SQL, in Python you're always telling it HOW you want it done.
sorted(vals)[-2] import heapq
n = 2
heapq.nlargest(n, vals)[n-1]
Edit: and out of morbid curiosity, here is the same implemented by a scan. It could be arranged as a single statement. swapIfGt = lambda xs, q: xs if xs[0] >= q else sorted([q]+xs[1:])
n = 2
reduce(swapIfGt, vals, [None]*n)[0]
Edit2: rewrote swapIfGt maximum $ filter (< maximum xs) xs
or like the Ruby one, but with better error semantics (`atMay` 1) . reverse . sort
since we don't know that there exists such a column.The effect is you end up with a declaration that is highly intelligible, exactly because you don't have to write the implementation.
The ruby method-based form really is the same thing, because the semantics of teh result are constant across different enumerable objects (conventionally, of course, nothing about the language enforces this), but different enumerables may well implement the behavior differently under the hood -- including using internal indexes -- and may, in fact, apply the same type of adaptive techniques used by an SQL-based RDBMS (or, in extreme cases, actually defer all the work to an SQL-based RDBMS where the data presented with an enumerable interface in Ruby actually lives.)
This is less true, perhaps, of some of the other examples in that the implementation will be the same for all objects in the same system (but you still get some of the conceptual clarity advantages of declarative programming, even if the its not leveraged as effectively behind the scenes by the runtime adapting the actual methods used to the data it is working on.)
- given a pan
- given 3 eggs
- given a little of olive oil
then one executes orders : - add the oil in the pan
- break the eggs and add them in the pan
- cook the eggs fo 5 mins
- serve the food hot
so i guess one is never really doing either pure declarative of imperative cooking/programming ?I like the cookbook metaphore for programming.
So, don't get me wrong. I'm all for declarative actions. I just think the "purely declarative" approach that many languages try and impose is a useful aberration when you consider how the vast majority of "programming" is done.
It's not.
> Since declarative programming is side-effect free, the runtime can perform the operation in the most efficient way it knows.
The point of declarative programming is to declare what you want done. The runtime may turn it into greatness or crap, that's not really relevant.
> In your example, if you replace max() with a function that prints something, you'll see it's executed for each value in 'table', which is extremely inefficient.
It's also not relevant and an implementation detail, with a known max() on a known type the implementation would be free to lift the computation out since this expression doesn't have side-effects. Just as a crappy SQL DB would be free to run the subquery once for each result.
> This happens because Python can't guarantee that max() will return the same each time.
The runtime can know that all types and functions involved are their native side-effect-less versions and is free to optimize them if it wishes, that it does not is irrelevant.
> Unlike in SQL, in Python you're always telling it HOW you want it done.
Nope, sorry, you're wrong.
I didn't say the quality of the runtime was relevant. The point is that the runtime is free to achieve the goal you give it in any way it feels fit. In Python, the runtime isn't; it needs to ensure specific runtime guarantees.
>> In your example, if you replace max() with a function that prints something, you'll see it's executed for each value in 'table', which is extremely inefficient.
> It's also not relevant and an implementation detail, with a known max() on a known type the implementation would be free to lift the computation out since this expression doesn't have side-effects.
No, it can't. The Python reference specifies the semantics of generator expressions, and it says "Only the outermost for-expression is evaluated immediately, the other expressions are deferred until the generator is run"[1].
A implementation that lifted the computation would not be implementing Python, but a derivative language.
Furthermore, your example didn't specify neither the type of neither 'table' nor 'max', so I still think it's deceptive.
> The runtime can know that all types and functions involved are their native side-effect-less versions and is free to optimize them if it wishes, that it does not is irrelevant.
See above. According to the language spec, it can't.
But then you lose pureness, right? The whole point of using high-order functions is allowing you to be as declarative as mathematics, so you can just operate functions together.
Consider that in the first example, I only need to write the implementation for doubling a number n, while the `doubleElements` implementation is too specific, would throw the other half of the code back into imperative land.
I suppose I should have said that the doubleElements implementation would likely be that map one liner. (Though, it needn't be. One could exploit custom knowledge of the domain there to do crazy crap like memoize the calls.)
That make sense?
Yes, the first example uses pure functions, but I guess you're confusing pure functions with HOFs [1]. The point is only having to write the implementation to double one number, and extrapolating it by composition. Consider that in your example, for instance, you would need a `doubleHash` function for hashes, and so on.
That is, I'm fine with using both the pure and HO functions. I just think hiding the HOF ones behind a normal function call is usually a big win for readability.
So, to do the full example:
elements = [1,2,3,4]
doubledElements = doubleElements(elements)
function doubleElements(e) {
function double(n) { return n*2; }
return e.map(double);
}
Where I would assume the "reader" code would only have the first two lines. The rest would be behind the implementation layer. If the double function would be used elsewhere, no need to scope it to doubleElements.