Esoteric programming paradigms(ybrikman.com) |
Esoteric programming paradigms(ybrikman.com) |
[1] https://github.com/imatix/gsl [2] https://news.ycombinator.com/item?id=11558007
In Python, PyContracts supports runtime type-checking and value constraints/assertions (as @contract decorators, annotations, and docstrings).
https://andreacensi.github.io/contracts/
Unfortunately, there's yet no unifying syntax between PyContracts and the newer python type annotations which MyPy checks at compile-type.
https://github.com/python/typeshed
What does it mean for types to be "a first class member of" a programming language?
A parallel pair of AND gates is physically concurrent, let alone anything with a clock.
[1]: http://witheve.com
Robust-First Computing: Distributed City Generation https://www.youtube.com/watch?v=XkSXERxucPc
A Movable Feast Machine [1] is a "Robust First" asynchronous distributed fault tolerant cellular-automata-like computer architecture.
The video "Distributed City Generation" [2] demonstrates how you can program a set of Movable Feast Machine rules that build a self-healing city that fills all available space with urban sprawl, and even repairs itself after disasters!
The paper "Local Routing in a new Indefinitely Scalable Architecture" [2] by Trent Small explains how those rules work, how the city streets adaptively learn how to route the cars to nearby buildings they desire to find, and illustrated the advantages of "Robust First" computing:
Abstract: Local routing is a problem which most of us face on a daily basis as we move around the cities we live in. This study proposes several routing methods based on road signs in a procedurally generated city which does not assume knowledge of global city structure and shows its overall efficiency in a variety of dense city environments. We show that techniques such as Intersection-Canalization allow for this method to be feasible for routing information arbitrarily on an architecture with limited resources.
This talk "Robust-first computing: Demon Horde Sort" [4] by Dave Ackley describes an inherently robust sorting machine, like "sorting gas", implemented with the open source Movable Feast Machine simulator, available on github [5].
A Movable Feast Machine is similar in many ways to traditional cellular automata, except for a few important differences that are necessary for infinitely scalable, robust first computing.
First, the rules are applied to cells in random order, instead of all at once sequentially (which requires double buffering). Many rule application events may execute in parallel, as long as their "light cones" or cells visible to the executing rules do not overlap.
Second, the "light cone" of a rules, aka the "neighborhood" in cellular automata terms, is larger than typical cellular automata, so the rule can see other cells several steps away.
Third, the rules have write access to all of the cells in the light cone, not just the one in the center like cellular automata rules. So they can swap cells around to enable mobile machines, which is quite difficult in cellular automata rules like John von Neumann's classic 29 state CA. [6] [7]
Forth, diffusion is built in. A rule may move the particle to another empty cell, or swap it with another particle in a different cell. And most rules automatically move the particle into a randomly chosen adjacent cell, by default. So the particles behave like gas moving with brownian motion, unless biased by "smart" rules like Maxwell's Demon, like the "sorting gas" described in the Demon Hoard Sort video.
In this video "Robust-first computing: Announcing ULAM at ECAL 2015" [8], David Ackley explains why "Robust First" computing and computing architectures like Movable Feast Machines are so incredibly important for scaling up incredibly parallel hardware.
I think this is incredibly important stuff in the long term, because we've hit the wall with determinism, and the demos are so mind blowing and visually breathtaking, that I want to try programming some of my own Movable Feast Machine systems!
[1] http://movablefeastmachine.org/
[2] https://www.youtube.com/watch?v=XkSXERxucPc
[3] http://www.cs.unm.edu/~ackley/papers/paper_tsmall1_11_24.pdf
[4] https://www.youtube.com/watch?v=helScS3coAE
[5] https://github.com/DaveAckley/MFM
[6] https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton
[7] https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...
The definition I like to use of concatenative language is:
If "X Y" is a legal program, then "X" is a list of tokens and a legal program and "Y" is a list of tokens and a legal program, and semantically, executing "X Y" is equivalent to executing "X" and then executing "Y".
practical implementations often deviate a little from this ideal.
Some of these languages are definitely suitable for "serious" usage. And I'm not sure I'd count SQL (or Prolog, for that matter) as esoteric at all!
Nonsense. The article's premise is explicitly that the listed paradigms will change how you think about coding. The author may or may not be correct about that, but the original title was not random hyperbole to entice a click - it was an accurate description of the article's premise and content.
> "esoteric" carries some quite strong negative connotations
What? It didn't strike me that way. (And for what's it's worth, I'm a native English speaker; I was a communication major, an English minor, and a technical writer; since college I have continued to dabble in writing and linguistics for over 15 years; and as you can see I even know how to use a semicolon. :)Had they said "eccentric," maybe I would have agreed. Esoteric, to me, has more the connotation of a secret society of sages.
Anyway, the article itself uses "esoteric."
By they way, I'm so glad that they declickbaited the headline. Thank you, editor!
First, the article claims: "the sudoku solver above does a brute force search", but that is specifically not the case. In fact, quite the opposite holds: The Prolog Sudoku formulation shown in the article uses a powerful technique known as Constraint Logic Programming over Finite Domains, which automatically performs pruning before and also during the search for solutions. This effectively eliminates large portions of the search space in practice, and degenerates to brute force only in those cases where almost nothing can be deduced from initial constraints. In the particular case of Sudoku, the pruning is especially effective and can in many cases eliminate the search entirely, since the initial constraints (hints) basically determine the whole solution.
Second, yes, it is easy to write an O(n!) search algorithm in Prolog. However, it is almost as easy to implement much more efficient search algorithms in Prolog. For example, here is Quicksort in Prolog:
quicksort([]) --> [].
quicksort([L|Ls]) -->
{ partition(Ls, L, Smalls, Bigs) },
quicksort(Smalls), [L], quicksort(Bigs).
Note how natural the declarative description of "first the smaller elements, then the pivot element, then the bigger elements" is in Prolog. This only requires a suitable implementation of partition/4, which is very easy to implement in at most 7 lines of Prolog code.https://news.ycombinator.com/item?id=13375108
See the "genuine sieve of eratosthenes" paper, and also:
http://augustss.blogspot.com/2007/08/quicksort-in-haskell-qu...
In Prolog, an arguably more natural sorting method is merge sort, and it can (also) be implemented with asymptotically optimal efficiency in Prolog as follows:
mergesort(Ls0, Ls) :-
length(Ls0, L),
zcompare(C, L, 1),
halving(C, L, Ls0, Ls).
halving(<, _, Ls, Ls).
halving(=, _, Ls, Ls).
halving(>, L, Ls0, Ls) :-
Half #= L // 2,
length(Lefts0, Half),
append(Lefts0, Rights0, Ls0),
mergesort(Lefts0, Lefts),
mergesort(Rights0, Rights),
merge(Lefts, Rights, Ls).
merge/3 is a built-in in several Prolog systems, and if your system does not provide it, you can easily implement it in less than 10 lines of Prolog code.Both quicksort and merge sort Prolog implementations are (even in their respective worst cases) vastly faster than the sorting algorithm given in the article (in its worst case, and also in its average case), and are at most marginally more complex to implement. For this reason, I think the article could be improved by simply showing one of these algorithms instead.
We cannot consider it a drawback of Prolog that computing all permutations takes O(n!) time in this language. It's the same for C, for example. If someone wants faster sorting, we can implement it also in a declarative language!
Somewhat strangely, the fact that you can easily write a Prolog program to generate all N! permutations of a list with N elements is more likely to be counted as a disadvantage than as an advantage, in my experience. If people looked at C and Prolog a bit more fairly in such overview documents, they could as well count it as a disadvantage of C that such a program is harder to write in C than in Prolog. Yet, I have never seen a C program implementing permutation sort in such articles, and never seen a comparable argument or suggestion about C as I routinely see it about Prolog.
Permutation sort is slow in C! Well, that's a well-known shortcoming of an imperative language such as C.
I think two of the OP's notes are covered in Mozart/Oz - concurrent programming using data floor variables (roughly equivalent to promises) and logic programming, including finite domain constraints.
For finite domain constraints, such guarantees are a major attraction in the Prolog community: If setting up the constraints terminates, then setting up the constraints plus search also terminates. That's a rather valuable and strong guarantee (since it is the search that can take take infeasibly long more often than setting up the constraints), and it breaks down if the search itself is more generally programmable.
Now, to your point, this is really the only way you can solve sudoku. There is an odd belief that you can make an algorithm that "never guesses." And folks think that that would be the definition of a non-brute force algorithm. In reality, you either have a solver that is ready to backtrack, or you have one that can't solve all puzzles.
Of course that does not hold in general, but in the special case of Sudokus and problem modelling by a domain-consistent all-different propagator, no search is required for a solution.
[1] A filtering algorithm for constraints of difference in CSPs, J-C. Régin, 1994
| ?- sudoku(X, Y).
This is called the most general query, since all arguments are fresh variables. Declaratively, we are asking Prolog: "Are there any solutions whatsoever?" In this case, the system answers with: X = [_#3(1..4),_#24(1..4),...etc.]
Y = [_#3(1..4),_#24(1..4),...etc.]
This shows that the Prolog program did not perform any search at all: No concrete value is instantiated, and the system does not ask for alternatives. That's right! No search whatsoever, and no backtracking at all, is performed in this program. No matter which definition of brute force search you are applying, this definitely does not fall into "search" at all!In the article, a more concrete query is also shown, as well as its solution. This solution is found via constraint propagation alone. This means that the system has deduced the unique solution purely by reasoning about the available constraints, which it does automatically every time a constraint is posted. This is one example of an algorithm that does not guess.
This also shows that even supposing you cannot make an algorithm that "never guesses", you can definitely make an algorithm that has to do much less guessing than searching over the whole tree. In Prolog, such algorithms are implemented and encapsulated in powerful combinatorial constraints like all_distinct/1, which are available in many Prolog systems and which you can use in your applications. Internally, the associated constraint propagators prune the search tree by applying various algorithms for you behind the scene. The ability to use such constraints and their strong pruning are major attractions of using Prolog for combinatorial tasks.
You are right that there may be cases where such propagation, albeit quite strong, may not suffice to fully solve a concrete combinatorial task. For this reason, you have to apply a concrete enumeration of remaining variables. In Constraint Logic Programming over Integers, this search is called labeling and provided by predicates like fd_labeling/2 or similar, depending on your Prolog system. The article does not use them though, and even if it did, the search could be guided by various heuristics by simply supplying a few options, which together with the pruning applied by constraints distinguish such a search from more uninformed search strategies.
Note also that the posted solution is quite hard to generalize elegantly. A shorter and equivalent Prolog program that describes 4x4 Sudoku puzzles is:
sudoku(Rows) :-
Rows = [Row1,Row2,Row3,Row4],
maplist(same_length(Rows), Rows),
blocks(Row1, Row2), blocks(Row3, Row4),
append(Rows, Vs),
fd_domain(Vs, 1, 4),
maplist(fd_all_different, Rows),
transpose(Rows, Cols),
maplist(fd_all_different, Cols).
blocks([], []).
blocks([A1,A2|As], [B1,B2|Bs]) :-
fd_all_different([A1,A2,B1,B2]),
blocks(As, Bs).
This assumes the availability of append/2 and transpose/2, which are likely already provided by your Prolog system, and easy to implement if they aren't.Isn't this why sudoku is still considered NP hard? (unless there has been a recent breakthrough).
Might try making a toy language out of it eventually.
Forth is known by name and reputation but very few people have ever tried to write anything in it - let alone delve into it's strange bootstrapped nature.
Incidentally, Aurora, mentioned in the article, evolved into Eve, which is inspired heavily by Datalog.
As someone who loves formal methods and believes most mainstream software systems today are mostly interactive and/or synchronous, I think this paradigm has a lot of untapped potential, and I'm glad to see it slowly move out of safety-critical systems into the mainstream, in languages like Eve.
[1]: https://en.wikipedia.org/wiki/Synchronous_programming_langua...
[2]: https://en.wikipedia.org/wiki/Esterel
[4]: http://witheve.com/
[5]: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-...
Hardware is also concurrent by default. Coding some hardware logic will also change the way you approach coding. Anyone who's interested get an FPGA demo board and write some verilog or VHDL - I highly recommend it.
1. https://en.wikipedia.org/wiki/Hardware_description_language
Of course, always remember that "can be compiled" != "can be synthesized".
square = code(dup, op("*"))
Add a bit of syntactic sugar and you'll get "concatenative", or, better stack combinator-oriented language.
As for Prolog, I really appreciate that the author is calling it "declarative" not a "logic" language. It's more important to learn about backtracking and unification (powerful variant of pattern matching) than about something like Horn clauses. For anyone who wants to learn Prolog better (and it's worth it, Prolog is one of most beautiful PLs around!) I recommend this article: http://www.amzi.com/articles/prolog_under_the_hood.htm
[1] https://arxiv.org/abs/1501.00720 [2] https://www.researchgate.net/profile/Alexandr_Savinov
Disclaimer: I am the author of concept-oriented programming and data model
What I see similar in object-capability model and COP is that they both try to treat object access as a highly important part of a system behavior. In COP, I used to write that it is more important what happens during access rather than in the object itself. Hence, being able to make these intermediate actions integral part of the program (and develop good PL for that purpose) is very important.
https://youtu.be/VZQoAKJPbh8 very good talk about the background of eve. When he finally talks about eve you might want to switch to a more recent talk about it.
There's going to be a lot of really exciting stuff coming over the next few months. We've gathered a set of ideas, evidence, and implementations that have certainly blown us away - we hope others will find it valuable too.
EDIT: I realized I didn't address your initial question. Fwiw, we just recently crossed a really big milestone in terms of usage - more than 40,000 people have now played around with Eve on http://play.witheve.com and we've learned a ton from that experience. Part of the website revamp will be making that workflow simpler and nicer. We have a surprisingly high conversion rate (> 30%), so hopefully we can help smooth out that experience and begin to grow the community more and more.
[1]: http://witheve.com
Forth is a great concatenative language, since its the pioneer in that area (I think), but Factor is definitely worth mentioning too as a "modern" take on the paradigm. It essentially tries to be a concatenative Lisp.
ANI was dead even in 2014 when this article was written (which the author acknowledges: "the language seems dead, but the concepts are pretty interesting"). It has some really interesting ideas, but since it never got implemented, I'm not sure how much use there is in discussing it here amongst real languages. It would be useful as a discussion for possible future languages for sure, but its currently still just a concept, so I'm not sure what practical thing you can learn from it right now.
If every language has its own specific dark patterns and bottlenecks, LabVIEW's is definitely the "brightly-colored spaghetti" structural breakdown of an advanced beginner's code :-)
Incidentally, why do we, as programmers, tend to focus on a language's bottlenecks so much, in such an emotionally charged way? Any psychology-of-programming people out here? You might consider LabVIEW an excellent case study in getting on people's nerves...
ATS is aimed at system programming, and if you think Idris has a steep learning curve, you'll need belaying for ATS. And, the language is horrible. But it's really mind expanding to see it in action.
Dafny is a really basic language with support for Hoare/Dijkstra verification. It's completely unlike the type system model.
Stateflow or QP/QM, all other systems suck.
And what are the drawbacks? Why aren't everybody using it?
Drawbacks: it's very un-agile; you really have to think the system through completely. (The magic being that if you do this, it is very likely correct by design.) It's not really feasible to specify a part of the system now and leave other parts open for later refinement. The other drawback being that no good non-commercial options exist.
The section on "symbolic programming" has me pondering still about potential implications. It makes me imagine something like a "visual" WYSIWYG editor, but a "conceptual" editor.. Looking forward to digging deeper via provided references.
There are some production-ready schedulers for GPP, like Intel's TBB[1] (C++), but learning to be effective with this requires a major shift in thinking about code - essentially thinking in graphs.
[1] - https://www.threadingbuildingblocks.org/tutorial-intel-tbb-f...
[1] https://github.com/janusassetallocation/loman [2] http://loman.readthedocs.io/en/latest/user/intro.html
[1] - https://software.intel.com/en-us/blogs/2011/01/10/using-the-...
[2] - https://software.intel.com/en-us/blogs/2011/09/13/using-inte...
https://youtu.be/275FQ9koAw8?t=7647
Very interesting view, especially considering how long ago that was and how relevant they still are.
[1] They were also the creators of JGL, the Java Generics Library, which was like a Java version of the C++ STL, and done before Java got generics natively.
> Dependent types
>
> Example languages: Idris, Agda, Coq
>
> You’re probably used to type systems in languages like C and Java,
> where the compiler can check that a variable is an integer, list, or string.
> But what if your compiler could check that a variable is “a positive integer”,
> “a list of length 2”, or “a string that is a palindrome”?
This is what I love about SQL. You can even define your own types, like "email", at least in PostgreSQL: create table contacts (
name text not null,
age int check (age >= 0),
email email
);
It already has a few of these special types built in, like for IP and MAC addresses (https://www.postgresql.org/docs/current/static/datatype.html).Thus, a dependently typed database would allow one columns type to depend on the value of another column.
It would be like saying
Create table dep ( name text, fieldType int not null, fieldValue (if fieldType = 1 then int else varchar))
Notice the 'if' in the type of fieldValueTo help offset the irony and celebrate Turing completeness, here is a PostScript interpreter implemented in PostScript, and an explanation of why.
http://www.donhopkins.com/home/archive/cyber/litecyber/ps.ps...
http://www.donhopkins.com/home/archive/cyber/litecyber/ps.ps...
It reminds me of a statement I read about managing application state, to treat the state (in this case a Redux store) as an "in-memory database". Add a layer to load/persist automatically - via LocalStorage, WebSocket, etc. - and it would be persistent by default. I suppose you wouldn't want everything persistent though, just a relevant slice of state.
Here's an article about "persistent languages", which includes discussion on related features. http://wiki.c2.com/?PersistentLanguage
First, declarative programming is a generic name which includes a broad range of paradigms - from functional to logic programming. Logic programming is something that deserves a special mention and discussion, because there are a number of interesting and unique concepts that deserve a more in-depth explanation.
Second, "dependent types" is better understood as a feature of a language (or better yet, of a type system) than a paradigm by itself.
Some of the other "paradigms" also seem more like characteristics of languages, and not really something that structures the way solutions are expressed/understood.
Edit: to put a slightly finer point on it, this irreducible foundation mostly has to do with how the language "computes." Imperative languages compute via statements that modify program state. Functional languages compute via proof reduction to normal form. Logic languages compute via proof search.
https://esolangs.org/wiki/Rail https://esolangs.org/wiki/Billiard_ball_machine
# synchronous
sleep 1 ;
sleep 2 ;
# parallel
sleep 1 &
sleep 2 &
wait
wait
The results have to be files... in shell you think of the file system as your "variables", so everything is somewhat global.Would you build something like this from scratch or base it on another tech stack like the jvm?
Declarative is a vague term, but for me it means mostly a way to compactly describe your problem in domain-specific terms. There are some languages in which you can produce compact code, like APL or J. But declarative for me means readable too. And there are cases, when in Prolog one can write a more compact and readable code than in Haskell. In Prolog you can describe just the essence of the problem. Not always, of course.
Another interesting thing about Prolog that it's a small language. It means that it has only few internal parts that make it alive (The complexity of many Prolog implementations is a result of fighting for perfomance). I really like small languages (like Oberon or Forth, for example), because it's possible to learn how they work internally. And the knowledge of this inner working helps to understand the language better. There is nothing "logical" inside Prolog, just a few powerful imperative constructs.
The author of "Prolog Under the Hood An Honest Look" says:
"Prolog, billed as "logic programming", is not really. You may be disappointed if that's what you expected to find. On the other hand, having backtracking, unification, and recursion inside one computer language leads to something very powerful and special."
And I, personally, like Prolog terms very much!
It is possible to craft a sudoku that has a unique solution without having any forced moves at the beginning. Turns out most are not actually hard, but it is doable. When I get back near my book at home, I an post an example, if folks are interested.
> Most people only [...] deal with SQL on a very shallow level and
> barely regard it as programming at all.
Fixed that for you.And now the sentence states why there are so many problems in the world (of software).
;)
Is there any work or ideas on how to solve that issue?
And so it's also hard to add features later, in next versions ?
https://en.wikipedia.org/wiki/Stateflow
https://en.wikipedia.org/wiki/Simulink
Recent example from high-assurance security:
https://www.umsec.umn.edu/sites/www.umsec.umn.edu/files/hard...
Eve has a similar philosophy to Red/Rebol - that programming is needlessly complex, and by fixing the model we can make the whole ordeal a lot nicer. We start with a uniform data model - everything in Eve is represented by records (key value pairs attached to a unique ID). This keeps the language small, both implementation-wise and in terms of the number of operators you need to learn.
Programs in Eve are made up of small blocks of code that compose automatically. In each block you query records and then specify transforms on those records. Blocks are short and declarative, and are reactive to changes in data so you don't worry about callbacks, caching, routing, etc.
Due to this design, we've reduced the error footprint of the language -- there are really only 3 kinds of errors you'll ever encounter, and those mostly relate to data missing or being in the wrong shape that you expect. What's more, we'll actually be able to catch most errors with the right tooling. You'll never experience your app crashing or errors related to undefined/nil values.
We've made sure your program is transparent and inspectable. If you want to monitor a value in the system, you can just write a query that displays it, as the program is running. I like to think of this as a "multimeter for code". You can do this for variables, memory, HTTP requests, the code itself ... since everything is represented as records, everything is inspectable.
Because at its core Eve is a database, we also store all the information necessary to track the provenance of values. This is something most languages can't do, because the information just isn't there. So for instance if a value is wrong, you can find out exactly how it was generated.
There's a lot more work to do, but we have big plans going forward. We plan to make the runtime distributed and topology agnostic, so you can potentially write a program that scales to millions of users without having to worry about provisioning servers or worrying about how your code scales.
We're also planning on multiple interfaces to the runtime. Right now we have a text syntax, but there's no reason we couldn't plug in a UI specific editor that integrates with text code. We've explored something like this already.
Anyway, those are future directions, but what we have right now is a set of semantics that allow for the expression of programs without a lot of the boilerplate and hassle you have to go through in traditional languages, and which provide some unique properties you won't find in any other language (at least none that I've used).
The code examples demonstrate surprising simplicity in achieving features that would be complicated to implement in other languages. I'm convinced that Eve will influence how programmers think (at least it did for me) and promote development of languages/frameworks/libraries that adopt some of these ideas. Great work, will be following with interest.
Quick question: Will it always be online only?
https://en.wikipedia.org/wiki/Esterel
(if you'd like to throw a link in your comment, would be happy to delete this one)
I came in contact with the language during my studies (one of my profs worked on it), found it interesting, but haven't really had a chance to apply it to anything I've done professionally since. It's a very neat language though, and if I got to do anything in the embedded field.
I notice that there is now an open source compiler, which is great. The only implementation I was aware of so far was closed source.
So, I'm ultimately wary of the claims here. The system has to be doing a search. It may be crafted in such a way that it just walks straight to the solution for some specifications, but that is as much from luck of easy constraints as it is the algorithm.
For this question, I can really only refer to Knuth. This is exercise 515 of 7.2.2.2. I likely just messed you up on the wording. There are no forced moves for open squares. As soon as you pick one, though, it may force all others.
> This solution is found via constraint propagation alone. This means that the system has deduced the unique solution purely by reasoning about the available constraints, which it does automatically every time a constraint is posted. This is one example of an algorithm that does not guess.
Have you considered how that constraint propagation algorithm works? What specific algorithm are you referring to? Constraint Programming is an alternative formulation of the same problem that Integer Linear Programming tackles, and the latter is known to be NP-hard.
https://en.wikipedia.org/wiki/Linear_programming#Integer_unk...
Constraint Programming and Integer Programming are both characterized as examples of Tree Search:
http://preserve.lehigh.edu/cgi/viewcontent.cgi?article=1780&...
> Since tree search is a basic solution technique employed in both constraint and integer programming, we begin with a generic overview oftree search as a technique for finding feasible solutions to mathematical models. [...]
> Every tree search algorithm is defined by four elements: a node- processing procedure, pruning rules, branching rules, and search strategy. The processing step is applied at every node of the search tree beginning with the root node, and usually involves some attempt to further reduce the feasible set associated with the node by applying logical rules.
> Overall a tree search algorithm works as follows. A list of nodes that are candidates to be processed is maintained throughout the algorithm. This list initially contains only the root node. At each step in the algorithm, a node is chosen and processed. If the processing results in pruning of the node by one of the pruning rules, then the node is discarded. Otherwise, the node is further divided, resulting in two or more children, which are then added to the list of candidates. This procedure is iterated until no nodes remain on the candidate list.
> [...] the constraint program is made up of two pieces: the constraints that comprise the formulation and a search strategy for solving the problem (emphasis mine). This is in contrast to a mathematical program, which describes a model but does not specify how to solve it.
> Traditionally, a constraint programming formulation consists only of a list of decision variables, domains, a set of constraints to be satisfied, and a search procedure (emphasis mine). This traditional definition refers to a constraint satisfaction problem (CSP), since a solution is any assignment of values to variables such that all constraints are satisfied.
Can you clarify which algorithm you're referring to, and how it manages to solve the problem without employing Tree Search?
EDIT: https://kx.com/download/
"involves some attempt to further reduce the feasible set associated with the node by applying logical rules"
These propagation rules are among the major attractions of CP, and distinguish the paradigm from uninformed search strategies that have to consider much larger portions of the search space in general. I think a description of CP should put at least equal emphasis on this pruning mechanism as on the actual search itself, because sufficiently strong propagation may render the search completely unnecessary. In fact, this is exactly what happens in the Sudoku example of the article!
In a sibling thread, sfrank has cited a very important reference as an example of such a propagation algorithm, which serves to illustrate this idea and is also widely used in practice in CP systems and their applications:
A filtering algorithm for constraints of difference in CSPs, J-C. Régin, 1994
To see this algorithm in action, consider the following example: Take three variables X, Y and Z, all of them integers that are either 0 or 1, with the additional constraint that they must be pairwise distinct. In Prolog with CLP, and said algorithm available under the name all_distinct/1, we can state this task as follows:
?- X in 0..1, Y in 0..1, Z in 0..1,
all_distinct([X,Y,Z]).
In response to this query, the system says: false.
This means the system has deduced that there are no solutions. Note that this result is obtained without applying any search! For comparison, suppose we naively post pairwise disequalities instead: ?- X in 0..1, Y in 0..1, Z in 0..1,
X #\= Y, Y #\= Z, X #\= Z.
In response, the system now answers: X in 0..1,
X#\=Z,
X#\=Y,
Z in 0..1,
Y#\=Z,
Y in 0..1.
This means that there could potentially be solutions. The system does not know whether there are any, so it shows us remaining constraints that must hold for any solution. Now, an additional search makes clear that there are none: ?- X in 0..1, Y in 0..1, Z in 0..1,
X #\= Y, Y #\= Z, X #\= Z,
label([X,Y,Z]).
Here, I have simply added the goal label/1, which is the explicit enumeration and thus the search I have mentioned in an earlier post. In response, we now again of course get: false.
As another example of propagation, please consider: ?- X in 0..1, Y in 0..1, Z in 0..2,
all_distinct([X,Y,Z]).
Z = 2,
X in 0..1,
all_distinct([X, Y, 2]),
Y in 0..1.
Here, the system has deduced that Z is necessarily 2, again without any search. For the other variables, both values are still admissible, and to obtain concrete solutions, we must again label them explicitly. However, we do know that there is a solution in this case, due to the strength of all_distinct/1, which internally uses a complex reasoning about the value graph of the CSP that is somewhat anticlimactically encapsulated in such a superficially trivial predicate, but propagates much more strongly than simply stating pairwise disequalities would.This shows that you can arrive at a solution (or prove the lack of a solution) either via a search or by using a sufficiently strong propagation mechanism. Doing more of one requires less work of the other. Please note that, as I said before, you indeed need an explicit search in general, because the propagation algorithms alone are, while often quite effective, not effective enough in general to guarantee consistency when different constraints interact, even if they guarantee domain consistency individually. This is a practical trade-off between propagation strength and efficiency of the available constraints. You also need search to enumerate all solutions, if there are more than one. However, note that propagators are also triggered during the search, and so unifying even a single remaining variable with a concrete integer may again lead to a situation where no additional search is necessary. Thus, the initial size of the search space is not a satisfactory measure for the eventual complexity of the search, because it does not account for the additional propagation that is applied during the search itself.
Even in the case of Sudoku, and even if you use the most powerful individual all_distinct/1 constraints, you need to search, in general, to truly generate the unique solution explicitly. But the Prolog code shown in the article doesn't (hence, it will lead to floundering constraints in general), and the concrete puzzle shown in the article is, coincidentally or not, correctly solved just by such propagation algorithms, without any search. This situation is also not particularly rare: In many cases of practical importance, constraint propagation alone already tightens remaining domains so significantly that no or only very little additional search is necessary.
Thus, everything you said is true: Yes, you need search in general, also when using CP, to obtain concrete solutions. However, to repeat, it is quite different from what one would call "brute force" search because the sophisticated and often very effective propagation algorithms prune the search space, often substantially, and can moreover help to guide the search with various heuristics. In practice, you typically buy a Prolog system just to benefit from these propagation algorithms! Various notions of consistency exist, and you can find more such algorithms in the literature references that are for example included in SICStus Prolog and other Prolog systems with constraints.
Second, note that the description you quote does not even cover the particular case that arises in the article and also many other cases in practice: That is, there are no nodes at all, because everything is already settled by constraint propagation before any search even starts, making the whole solution explicitly available without search.
The Eight Queens problem you cite is quite different from Sudoku, because it may have multiple solutions. For this reason, as you correctly observe, search is needed to generate the different solutions in such cases. But this does not invalidate the Sudoku example, where we know that exactly one solution exists, and sufficiently strong pruning could always determine it, whether by internal propagation or other algorithms that simply eliminate all values that do not participate in the unique solution of the puzzle. Nor does it show that the cost for the search outweights that of constraint propagation in either case.
For these reasons, I recommend to put at least equal focus on constraint propagation as on the search when discussing CP.
Do you happen to have a recommended link on how this can be accomplished? First few results in searching just show farming this out to a specialized function in matlab. :)
There is absolutely no "guessing" involved, though: you describe your sudoku problem as a series of linear equations (which represent the edges of your search space) and then the algorithm travels from one vertex to the next until it finds the solution. There is no backtracking at all.
> If all of the unknown variables are required to be integers, then the problem is called an integer programming (IP) or integer linear programming (ILP) problem. In contrast to linear programming, which can be solved efficiently in the worst case, integer programming problems are in many practical situations (those with bounded variables) NP-hard. 0-1 integer programming or binary integer programming (BIP) is the special case of integer programming where variables are required to be 0 or 1 (rather than arbitrary integers). This problem is also classified as NP-hard, and in fact the decision version was one of Karp's 21 NP-complete problems.
Here is an article on the topic of "Binary and Mixed Integer Programming" which explains part of the Balas Additive Algorithm (infeasibility pruning, I believe) in terms of backtracking:
> At this point, both of the nodes just created are not eligible for further expansion, so we back up the tree, looking for a level at which one of the nodes is unexplored.
http://www.sce.carleton.ca/faculty/chinneck/po/Chapter13.pdf
Another paper describes a method by Glover characterized as "A backtracking procedure for implicit enumeration".
> A particularization of the procedure based on Balas' algorithm. In S2 we presented and justified a flexible and economical back-tracking procedure for finding an optimal feasible solution of (P) by implicit enumeration.
See Figure 1 on page 182 (or page 6 in the PDF):
http://www.anderson.ucla.edu/faculty/art.geoffrion/home/docs...
This branch of mathematics is not my forte however, so if I've misunderstood then I'd appreciate clarification. It seems like the algorithm is not backtracking in the sense of generating possible solutions and checking them, but is backtracking in the sense of fathoming which next cheapest (partial) solution might be feasible, and abandoning it if proven to be definitely infeasible, in favor of examining the (then) next cheapest potential solution.
Finally, see the following paper that compares and contrasts Constraint Programming with Integer Programming, and characterizes both of them as instances of Tree Search:
http://preserve.lehigh.edu/cgi/viewcontent.cgi?article=1780&...
First of all: thanks everyone for forcing me to go back and re-visit some stuff about LP/IP - I think I was wrong - some sort of search tree/branch and bound is not just a speedup trick - it looks like it is the only way to actually get a correct solution for IP problems.
So let me amend my two previous statements:
Linear Programming does not require any search tree or backtracking is CORRECT.
Linear Programming can solve Sudoku is *WRONG* - for Sudoku you need Integer Programming, and that requires an approach that involves search trees/backtracking.
Sorry for the confusion, and thanks again for making me recheck my assumptions.I should also have stressed harder that I agreed with your definition of brute force. I was just offering an explanation for where I think the post went awry.
knuth(a, [[1,_,_,_,_,6,_,8,_],
[5,_,8,7,2,1,4,_,6],
[_,6,_,3,8,_,2,_,1],
[8,4,_,_,_,3,_,_,5],
[_,_,5,_,6,_,8,_,_],
[6,_,_,8,_,_,_,4,2],
[3,_,6,_,4,8,_,2,_],
[4,_,7,6,3,2,1,_,8],
[_,8,_,5,_,_,_,_,4]]).
Consider now a Prolog solution for 9x9 (i.e., "normal") Sudoku puzzles, using integer constraints: sudoku(Rows) :-
length(Rows, 9), maplist(same_length(Rows), Rows),
append(Rows, Vs), Vs ins 1..9,
maplist(all_distinct, Rows),
transpose(Rows, Columns), maplist(all_distinct, Columns),
Rows = [As,Bs,Cs,Ds,Es,Fs,Gs,Hs,Is],
blocks(As, Bs, Cs), blocks(Ds, Es, Fs), blocks(Gs, Hs, Is).
blocks([], [], []).
blocks([N1,N2,N3|Ns1], [N4,N5,N6|Ns2], [N7,N8,N9|Ns3]) :-
all_distinct([N1,N2,N3,N4,N5,N6,N7,N8,N9]),
blocks(Ns1, Ns2, Ns3).
This is indeed exactly a case where search is necessary, because the pruning applied by this concrete constraint solver is not strong enough to deduce the unique solution. If we only post the constraints, we get: ?- knuth(a, Rows),
sudoku(Rows),
maplist(portray_clause, Rows).
[1, _, _, _, _, 6, _, 8, _].
[5, _, 8, 7, 2, 1, 4, _, 6].
[_, 6, _, 3, 8, _, 2, _, 1].
[8, 4, _, _, _, 3, _, _, 5].
[_, _, 5, _, 6, _, 8, _, _].
[6, _, _, 8, _, _, _, 4, 2].
[3, _, 6, _, 4, 8, _, 2, _].
[4, _, 7, 6, 3, 2, 1, _, 8].
[_, 8, _, 5, _, _, _, _, 4].
However, it is exactly as you say: Even assigning a single variable a concrete value automatically enforces the solution by constraint propagation alone: ?- knuth(a, Rows),
sudoku(Rows),
Rows = [[_,Var|_]|_],
indomain(Var),
maplist(portray_clause, Rows).
[1, 2, 3, 4, 5, 6, 7, 8, 9].
[5, 9, 8, 7, 2, 1, 4, 3, 6].
[7, 6, 4, 3, 8, 9, 2, 5, 1].
[8, 4, 2, 1, 9, 3, 6, 7, 5].
[9, 7, 5, 2, 6, 4, 8, 1, 3].
[6, 3, 1, 8, 7, 5, 9, 4, 2].
[3, 1, 6, 9, 4, 8, 5, 2, 7].
[4, 5, 7, 6, 3, 2, 1, 9, 8].
[2, 8, 9, 5, 1, 7, 3, 6, 4].
Note though the general point: A sufficiently strong constraint solver could have deduced the unique solution even without trying any concrete value, exclusively by reasoning about the posted constraints with sufficient sophistication! This concrete solver can't, due to the particular trade-off between efficiency and pruning strength that was chosen by the implementor, and so search must settle the remaining part.One of the other puzzles is:
knuth(d, [[1,_,3,_,5,6,_,8,9],
[6,8,_,3,_,9,1,_,5],
[_,9,5,1,8,_,6,3,_],
[3,_,8,9,6,_,_,5,1],
[_,1,9,5,_,8,3,6,_],
[5,6,_,_,3,1,9,_,8],
[_,5,6,_,9,3,8,1,_],
[8,_,1,6,_,5,_,9,3],
[9,3,_,8,1,_,5,_,6]]).
In this case, the solution is not unique. We can use the constraint solver to count the number of admissible solutions, again using search: ?- knuth(d, Rows),
sudoku(Rows),
findall(., maplist(labeling([ff]), Rows), Ls),
length(Ls, L).
Yielding: L = 12.This indeed shows that search is necessary in general also when applying constraint programming. Still, no such search is applied in the posted article. In the example contained in the article, the constraints alone suffice to determine the unique solution. I leave it as an exercise for the reader to determine whether this is the case for all 4x4 Sudoku puzzles.
I'm curious on how you claim a strong enough solver could have solved this without a "guess."
Also, any thoughts on how treating them as constraint propagation compares to casting it as an exact cover problem?
As an alternative, we can just as well model Sudoku puzzles as combinatorial tasks over Boolean variables, and hence use CLP(B), constraint logic programming over Booleans.
For example:
sudoku(Rows) :-
length(Rows, 9), maplist(same_length(Rows), Rows),
maplist(row_booleans, Rows, BRows),
maplist(booleans_distinct, BRows),
transpose(BRows, BColumns),
maplist(booleans_distinct, BColumns),
BRows = [As,Bs,Cs,Ds,Es,Fs,Gs,Hs,Is],
blocks(As, Bs, Cs), blocks(Ds, Es, Fs), blocks(Gs, Hs, Is).
blocks([], [], []).
blocks([N1,N2,N3|Ns1], [N4,N5,N6|Ns2], [N7,N8,N9|Ns3]) :-
booleans_distinct([N1,N2,N3,N4,N5,N6,N7,N8,N9]),
blocks(Ns1, Ns2, Ns3).
booleans_distinct(Bs) :-
transpose(Bs, Ts),
maplist(card1, Ts).
card1(Bs) :- sat(card([1],Bs)).
row_booleans(Row, Bs) :-
same_length(Row, Bs),
maplist(cell_boolean, Row, Bs).
cell_boolean(Num, Bs) :-
length(Bs, 9),
sat(card([1],Bs)),
element(Num, Bs, 1).
In this formulation, I use 9 Boolean variables to represent which of the 9 possible integers is assigned to a particular cell.Let us apply this formulation to puzzle (b) that is shown in the solution of this exercise:
knuth(b, [[1,_,3,_,5,6,_,8,9],
[5,9,7,3,8,_,6,1,_],
[6,8,_,1,_,9,3,_,5],
[9,5,6,_,3,1,8,_,7],
[_,3,1,5,_,8,9,6,_],
[2,_,8,9,6,_,1,5,3],
[8,_,9,6,_,5,_,3,1],
[_,6,5,_,1,3,2,9,8],
[3,1,_,8,9,_,5,_,6]]).
Here is the result: ?- knuth(b, Rows),
sudoku(Rows),
maplist(portray_clause, Rows).
[1, 2, 3, 4, 5, 6, 7, 8, 9].
[5, 9, 7, 3, 8, 2, 6, 1, 4].
[6, 8, 4, 1, 7, 9, 3, 2, 5].
[9, 5, 6, 2, 3, 1, 8, 4, 7].
[7, 3, 1, 5, 4, 8, 9, 6, 2].
[2, 4, 8, 9, 6, 7, 1, 5, 3].
[8, 7, 9, 6, 2, 5, 4, 3, 1].
[4, 6, 5, 7, 1, 3, 2, 9, 8].
[3, 1, 2, 8, 9, 4, 5, 7, 6].
For comparison, here is the result of the CLP(FD)-based formulation that I posted earlier: ?- knuth(b, Rows),
sudoku(Rows),
maplist(portray_clause, Rows).
[1, _, 3, _, 5, 6, _, 8, 9].
[5, 9, 7, 3, 8, _, 6, 1, _].
[6, 8, _, 1, _, 9, 3, _, 5].
[9, 5, 6, _, 3, 1, 8, _, 7].
[_, 3, 1, 5, _, 8, 9, 6, _].
[2, _, 8, 9, 6, _, 1, 5, 3].
[8, _, 9, 6, _, 5, _, 3, 1].
[_, 6, 5, _, 1, 3, 2, 9, 8].
[3, 1, _, 8, 9, _, 5, _, 6].
This shows that CLP(B), in contrast to CLP(FD), has determined the unique solution without any search. This is an example of a solver that propagates as strongly as possible, and hence achieves global consistency even across different constraints.It also works for puzzle (a):
?- knuth(a, Rows),
sudoku(Rows),
maplist(portray_clause, Rows).
[1, 2, 3, 4, 5, 6, 7, 8, 9].
[5, 9, 8, 7, 2, 1, 4, 3, 6].
[7, 6, 4, 3, 8, 9, 2, 5, 1].
[8, 4, 2, 1, 9, 3, 6, 7, 5].
[9, 7, 5, 2, 6, 4, 8, 1, 3].
[6, 3, 1, 8, 7, 5, 9, 4, 2].
[3, 1, 6, 9, 4, 8, 5, 2, 7].
[4, 5, 7, 6, 3, 2, 1, 9, 8].
[2, 8, 9, 5, 1, 7, 3, 6, 4].
At this point, you may wonder: Why have I started with (b) instead of (a)? The reason is easy to explain: For puzzle (a), the strong propagation of the CLP(B)-based formulation took so long that it only completed now, after I have finished typing this, a few minutes after I had posted the query, whereas it only took a few seconds for example (b).Both examples have taken considerably longer to solve with CLP(B) instead of CLP(FD). That's not to say that one approach is better than the other in general or even in that particular case - instead, it is more an observation about the solver implementations and the chosen trade-offs between propagation strength and efficiency in both solvers.
CLP(B) achieves this strong consistency by internally considering, in a sense, the totality of all possible assignments, and this may require exponential space and hence also time. But for many important practical cases, the representation used by CLP(B) is very efficient, and indeed allows us to solve tough problems efficiently. Note that still, no search is involved here: Even the internal propagation of CLP(B) proceeds completely deterministically, and at no point in the computation do we have to guess anything.
It must also be said that the CLP(FD)-based formulation, in addition to being faster (despite the necessary search), is also shorter and arguably more natural than the Boolean variant. Casting this as an exact cover or hitting set task can definitely be done, either with CLP(B) or CLP(FD) constraints (since Booleans can also be considered as a special case of integers). However, it takes us even further away from the quite direct and readable "natural" CLP(FD) formulation. For cover problems in general, I recommend you obtain a CLP(B) solver that internally uses Zero Suppressed Binary Decision Diagrams (ZDDs). They let you efficiently represent cases where many of the variables (such as row selection indicators) are zero, which is typical for covering tasks.
Try to imagine this: by manipulating an n-dimensional matrix in a certain way, you get, at each step, a result which is guaranteed to be part of your solution space. You also have a way to find out, at each step, if you have found the optimal solution (in terms of your objective function) or if no solution exists.
So the process goes like this: Put your constraints in the matrix.
Loop -
Manipulate Matrix
Is this the optimal Solution?
Yes - return solution
Is the solution space empty/concave/unbound?
Yes - return error
End.
No backtracking in the sense of "try this... hmm... no, try this other".I have used LP professionally in the past, and recently participated in the MOOC I linked above (as a refresher) but I might surely be missing something (the theory I studied at UNI too many years ago - now I just use it as a tool) or overgeneralizing too much. If anyone can provide corrections these will be welcomed.
http://preserve.lehigh.edu/cgi/viewcontent.cgi?article=1780&...
> Since tree search is a basic solution technique employed in both constraint and integer programming, we begin with a generic overview of tree search as a technique for finding feasible solutions to mathematical models.
> Every tree search algorithm is defined by four elements: a node- processing procedure, pruning rules, branching rules, and search strategy. The processing step is applied at every node of the search tree beginning with the root node, and usually involves some attempt to further reduce the feasible set associated with the node by applying logical rules. [...]
> One particularly well-developed solution technique, called branch and bound, was introduced by Land and Doig. Branch and bound is frequently used to find solutions to integer programming problems; it is a tree search procedure in which a bound on the optimal value to each subproblem is obtained by solving a relaxation. In this thesis we assume the use of a linear programming relaxation in the branch and bound tree.
> The last piece of a branch and bound algorithm is a search strategy specifying the order in which to explore nodes of the tree. Depth-first search can be used to save memory in a branch and bound tree, since we must only remember the difference in solution between two successive nodes of the search tree.
Chapter nine of Applied Mathematical Programming from MIT classifies integer program solvers into three major groups:
http://web.mit.edu/15.053/www/AMP-Chapter-09.pdf
> Whereas the simplex method is effective for solving linear programs, there is no single technique for solving integer programs. Instead, a number of procedures have been developed, and the performance of any particular technique appears to be highly problem-dependent. Methods to date can be classified broadly as following one of three approaches:
> i) enumeration techniques, including the branch-and-bound procedure;
> ii) cutting-plane techniques; and
> iii) group-theoretic techniques. [...]
> Branch-and-bound is essentially a strategy of ‘‘divide and conquer.’’ The idea is to partition the feasible region into more manageable subdivisions and then, if required, to further partition the subdivisions. In general, there are a number of ways to divide the feasible region, and as a consequence there are a number of branch-and-bound algorithms.
> An integer linear program is a linear program further constrained by the integrality restrictions. Thus, in a maximization problem, the value of the objective function, at the linear-program optimum, will always be an upper bound on the optimal integer-programming objective. In addition, any integer feasible point is always a lower bound on the optimal linear-program objective value. The idea of branch-and-bound is to utilize these observations to systematically subdivide the linear programming feasible region and make assessments of the integer-programming problem based upon these subdivisions.
The book describes the "cutting-plane" approach, which does seem to work more like how you're describing (a series of transformations), but also says:
> In practice, the branch-and-bound procedures almost always outperform the cutting-plane algorithm. Nevertheless, the algorithm has been important to the evolution of integer programming. Historically, it was the first algorithm developed for integer programming that could be proved to converge in a finite number of steps. In addition, even though the algorithm generally is considered to be very inefficient, it has provided insights into integer programming that have led to other, more efficient, algorithms.
"In my mind" it works like this: https://www.quora.com/What-is-the-difference-between-integer...
Having said this, while NP is guaranteed to terminate the actual computation time may take ages. Integer Programming may very well leverage the fact that the solution is restricted to integer values and apply different, faster algorithm to exploit this property - including search trees. But in the most general sense you do not need a search tree or backtracking for Linear Programming.