> we figured out that facebooks Facebot crawler will crawl _every_ url that was recorded by their tracking pixel.
I would be more surprised to find out that they didn't crawl everything they can, specifically pages that invite them in.
> 1. they are crawling potentially sensitive information granted by links with tokens
If the page contains sensitive information, you absolutely should not have code on it that you do not control (any code loaded from third-party hosts, not just Facebook's bits).
As a matter of security due diligence, if you have third-party hosted code linked into any such pages, you should remove it with some urgency and carefully review the design decisions that led to the situation. If you really must have the third-party code in that area, then you'll need to find a way of removing the need for the tokens to be present in the URL.
Furthermore, if the information is sensitive to a particular user, then your session management should not permit a request from Facebook (or any other entity that has not correctly followed your authentication procedure) to see the content anyway.
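To make the session point concrete, here is a minimal sketch of that kind of check. Everything in it is an assumption for illustration: the in-memory `SESSIONS` store and the `log_in` and `status_for` helpers are hypothetical names, and a real site would use whatever session mechanism its framework provides. The point is only that a link being known is not enough; the request also has to carry a valid session for the right user.

```python
import secrets

# Hypothetical in-memory session store: session id -> user id.
SESSIONS = {}


def log_in(user_id):
    """Create a session once the user has passed your real authentication."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = user_id
    return session_id


def status_for(session_cookie, page_owner):
    """Return the HTTP status a request for a user's private page should get."""
    user_id = SESSIONS.get(session_cookie) if session_cookie else None
    if user_id is None:
        return 401  # no valid session: crawlers and anonymous clients stop here
    if user_id != page_owner:
        return 403  # authenticated, but not the owner of this page
    return 200
```

A crawler replaying a recorded URL arrives without the session cookie, so under this sketch it would get a 401 rather than the page content.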
> 2. they are triggering potentially harmful and/or confusing actions in your website by repeating links
Possibly true, but again that suggests a design flaw in the page in question. I assume that they are not sending POST or PUT requests? GET and HEAD requests should at the very least be idempotent (so repeated calls are not a problem) and ideally have no lasting side effects (with the exception of logging).
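As a rough illustration of keeping GET and HEAD free of side effects, here is a sketch of a hypothetical unsubscribe endpoint. The `handle_unsubscribe` function, the `SUBSCRIPTIONS` dictionary and the endpoint itself are all made up for this example; the idea is just that following the link only shows a confirmation, and the state change happens on POST.

```python
# Hypothetical subscription state: email -> subscribed?
SUBSCRIPTIONS = {"alice@example.com": True}


def handle_unsubscribe(method, email):
    """Return (status, body) for a made-up /unsubscribe endpoint."""
    if method in ("GET", "HEAD"):
        # Safe methods: show a confirmation prompt and change nothing,
        # so a crawler replaying the link has no effect.
        body = "Press the button to confirm unsubscribing." if method == "GET" else ""
        return 200, body
    if method == "POST":
        # The state change lives here; setting a flag is idempotent,
        # so an accidental repeat of the POST does no further harm.
        SUBSCRIPTIONS[email] = False
        return 200, "Unsubscribed."
    return 405, "Method not allowed."
```

With this shape, a crawler repeating the GET any number of times leaves `SUBSCRIPTIONS` untouched.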
> 3. they are repeating requests in a broken way by not encoding url-parameters correctly
That does sound like a flaw, but one that your code should be immune to being broken by. Inputs should always be validated, and no action taken unless they are valid. This is standard practice for good security and stability. The Internet is a public place, and the public includes both deliberately nasty people and damagingly stupid ones, so your code needs to take proper measures to stop malformed inputs from causing problems.
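For what it's worth, here is a small sketch of what "validate, then act" can look like for URL parameters, using only Python's standard library. The parameter names (`order`, `action`), the allowed values and the `parse_request` helper are assumptions invented for the example, not anything from the original site.

```python
import re
from urllib.parse import parse_qs

# Hypothetical rules: "order" is a short decimal id, "action" is from a fixed set.
ORDER_ID = re.compile(r"[0-9]{1,10}")
ACTIONS = {"view", "download"}


def parse_request(query_string):
    """Return (order_id, action), or None if the query string is not valid."""
    try:
        # strict_parsing makes malformed fields raise instead of being skipped.
        params = parse_qs(query_string, strict_parsing=True)
    except ValueError:
        return None
    # Require exactly the expected parameters, each given exactly once.
    if set(params) != {"order", "action"}:
        return None
    if len(params["order"]) != 1 or len(params["action"]) != 1:
        return None
    order, action = params["order"][0], params["action"][0]
    if not ORDER_ID.fullmatch(order) or action not in ACTIONS:
        return None
    return int(order), action
```

The caller would answer with a 400 and do nothing else whenever this returns `None`, so a badly re-encoded URL from a crawler is rejected instead of being half-interpreted.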
You can't use "the page isn't normally linked from other sources so won't normally be found by a crawler" as a valid mitigation because the page could potentially be found by a malicious entity via URL fuzzing.
> 4. I could not find a warning or note on their tracking-pixel documentation that pages tracked would be crawled later
A warning would be nice, but again, unless they explicitly say they won't do such things, I would be surprised to find that they didn't, not that they do.