The training datasets here also seem pretty small, by comparison? "Hundreds of closed source projects we own"?
It'd be interesting to see if it works well. This is an easy product to prove: just generate a bunch of CVEs from open source code.
† SAST is enterprise security dork code for "security linter"
It's very true. SAST is really enterprise security dork code for "security linter"! I might start using that with some of our developer facing content.
We recently launched a project that combines LLMs + static code analysis to detect more sophisticated business logic and code logic findings, i.e. more real issues. We wanted a name that follows industry conventions for familiarity but still differentiates this type of testing, so we called it BLAST (Business Logic Application Security Testing).
We're very proud of the work we recently did, and wanted to share it with the greater HN community. We'd love to hear your feedback and thoughts. Let me know if I can clarify anything in particular.
Shift left and modern development patterns can catch a very large amount of known vulns so in newer applications things become mostly about fixing newly discovered vulns and doing it in an active development cycle. It's the older code that's the real scary monster and identifying the vulns is the least scary part of the process to get them remediated and put into production.
Anything that reduces false positives is good, especially if it does so without also making a significant reduction in identified true positives, but none of that changes the fact that it is the low hanging fruit of the system.
On false positives, we introduced false positive detection using AI & static analysis because of the exact issue you're highlighting.
I wonder if this is an NSA front? Or Palantir maybe? Or NSO?
There are some free tools out there but most do lag behind the industry as a whole by quite a bit. There's also lots of abandoned free tools out there cluttering up the space. Plenty started with good intentions that now give a false sense of security. There's also lots of snake oil in the paid space. Doing one's homework really helps here and you'd be surprised how many tools fail miserably during a simple proof of concept test, which is probably why more and more vendors try to avoid them.
I simply do not understand why the SQL API even allows injection vulnerabilities. Adam Ruppe and Steven Schveighoffer have done excellent work in writing a shell API over it (in D) that makes such injections far more difficult to inadvertently write.
On airplanes, when a bad user interface leads to an accident, the user interface gets fixed. There's no reason to put up with this in programming languages, either.
Reliable = deterministic
Accurate? Not at all. Studies show that ~30% of findings are false positives. We've also seen that with the companies we work with, because we built a false positive detection feature in Corgea. There's another ~60% of issues that are false negatives. https://personal.utdallas.edu/~lxz144130/publications/icst20...
We combine static analysis + LLMs to do better detection, triaging and auto-fixing because static analysis alone is broken in many ways.
And of course, the danger of AI is much greater than just inequality: it is the further reduction of all human beings to cogs in a machine, and that is bad even if we all end up being relatively equal cogs.
Doing so, we've been able to capture a very wide range of vulnerabilities, mainly web application vulnerabilities. We've done this across projects ranging from small to very large.
How would one implement this?
"SQL APIs" use prepared statements. Meaning you have a string for SQL and some dynamic variables that inject into that string via $1, $2 etc.
BUT now if a developer makes that string dynamic via a variable, then you have SQL injection again.
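To make the difference concrete, here is a minimal sketch using Python's standard sqlite3 module. Note that sqlite3 uses ? placeholders rather than the $1, $2 style mentioned above, and the table, the function names and the payload are made up for illustration.

```python
import sqlite3

# In-memory database with two example users (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO users (username, active) VALUES (?, ?)",
    [("alice", 1), ("bob", 0)],
)


def find_user_safe(conn, username):
    # The SQL text is a constant; the value is bound separately.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def find_user_unsafe(conn, username):
    # The SQL text itself is built from user input: injection is back.
    sql = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(sql).fetchall()


payload = "nobody' OR '1'='1"
safe_rows = find_user_safe(conn, payload)      # payload is treated as a plain value
unsafe_rows = find_user_unsafe(conn, payload)  # payload rewrites the WHERE clause
```

With the bound parameter the payload matches no user; with the concatenated string it turns the WHERE clause into a condition that is always true and returns every row.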
The low-level API could simply not allow SQL statements as strings, and instead provide separate functions to build the queries and statements.
It would provide entry points which could be used to ensure proper escaping and such, and would still allow for easily generating queries dynamically in the cases where that is needed.
Of course, it doesn't completely guard against Bobby Tables[1]: one could imagine someone including a run-time code generator and feeding it unprotected SQL as input.
But it should make it a lot more difficult, as it would be much more "unnatural", requiring going against the grain, to inject unprotected user data. Also, the "query_execute" function could raise an error if there's more than one statement, requiring one to use a different function for batch execution.
Pseudo-codish example off the top of my head, for the sake of illustration:
is_active = str_to_bool(args['active']); // from user
qry = new_query(ctx);
users_alias = new_table_alias(qry, 't');
query_select_column(users_alias, 'id');
query_select_column(users_alias, 'username');
query_from_table(users_alias, 'users');
filter_active = query_column_eq_clause(users_alias, 'active', is_active);
where = query_where(qry);
query_where_append(where, filter_active);
cursor = query_execute(qry);
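For what it's worth, the pseudo-code above can be turned into something runnable fairly easily. A rough Python sketch on top of the standard sqlite3 module follows; the Query class and its method names are hypothetical (not an existing library), and the identifier check is a deliberately strict assumption.

```python
import re
import sqlite3

# Assumption: identifiers are restricted to a conservative character set.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name):
    # Reject anything that is not a plain identifier, then quote it.
    if not _IDENT.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return '"' + name + '"'


class Query:
    """Hypothetical builder: callers never pass raw SQL text."""

    def __init__(self, table):
        self._table = _ident(table)
        self._columns = []
        self._filters = []
        self._params = []

    def select(self, column):
        self._columns.append(_ident(column))
        return self

    def where_eq(self, column, value):
        # The value is never spliced into the text; it becomes a bound parameter.
        self._filters.append(_ident(column) + " = ?")
        self._params.append(value)
        return self

    def build(self):
        if not self._columns:
            raise ValueError("no columns selected")
        sql = "SELECT " + ", ".join(self._columns) + " FROM " + self._table
        if self._filters:
            sql += " WHERE " + " AND ".join(self._filters)
        return sql, tuple(self._params)

    def execute(self, conn):
        sql, params = self.build()
        return conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, active INTEGER)"
)
conn.execute("INSERT INTO users (username, active) VALUES ('alice', 1), ('bob', 0)")

is_active = True  # would come from user input in a real application
qry = Query("users").select("id").select("username").where_eq("active", is_active)
rows = qry.execute(conn).fetchall()
```

The only way user data reaches the database here is through where_eq, which always binds it as a parameter, so getting it wrong requires going against the grain of the API.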
[1]: https://xkcd.com/327/
Go one level up.
For example, prepared statements should not allow strings in the SQL, but rather variables that are then bound to values, like PDO does.
I believe it largely is due to how SQL is designed to allow multiple queries to be concatenated with each other, and poor logic design when writing such queries.
The D programming language allows direct use of C printf. However, D checks the arguments against the format specifiers in the format string to make it memory safe.
The constant stream of bugs due to format/arguments is now history.
There is no reason why C and C++ compilers cannot do this, too.
Yes, indeed. The AI could be used to prefilter the list of warnings generated by static analysis to reduce the number of false positives. To achieve that, an AI could use the history of the project's static analysis results to find likely false positives. Or an AI could propose a patch to avoid a warning. If it is automatically compiled and passed through the test suite and the whole CI pipeline, it could reduce the manual effort of dealing with the findings of static analysis tools.
But leaving out the static analysis tools would lose so much value.
We combine static analysis + LLMs to do better detection, triaging and auto-fixing because static analysis alone is broken in many ways.
We've been able to reduce tickets for customers by ~30% with false positive detection, and can now detect classes of vulnerabilities in business and code logic that were previously undetectable.
Take this pseudo-code as an example:
snprintf(fmt,userinputstring,args); printf(fmt,somearray);
“Autonomous AI is dangerous”
“pfft, are you worried about X outcome? We already had it”
I'd rather have SQL API taking not strings but a special type that string can't be directly converted into without escaping (by default).
In C++ tagged literals could be used to create this special type easily. Similar constructs exist in some other languages
JS and PHP have tagged literals
But they have to be “escaped” properly before being interpolated!
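A rough sketch of the "special type" idea in Python, for illustration only. The Sql class and the trusted, value and run helpers are hypothetical names, not part of any library. Python cannot tell a literal from a dynamic string at run time, so trusted() is an explicit, greppable opt-in here; a type checker that supports typing.LiteralString (PEP 675) could tighten that further.

```python
import sqlite3


class Sql:
    """SQL text plus bound parameters; plain strings never mix in implicitly."""

    def __init__(self, text, params=()):
        self.text = text
        self.params = tuple(params)

    def __add__(self, other):
        # Only Sql + Sql is allowed; Sql + str ends in a TypeError.
        if not isinstance(other, Sql):
            return NotImplemented
        return Sql(self.text + other.text, self.params + other.params)


def trusted(text):
    # Explicit opt-in for developer-written SQL fragments.
    return Sql(text)


def value(v):
    # User data becomes a placeholder plus a bound parameter, never SQL text.
    return Sql("?", (v,))


def run(conn, query):
    if not isinstance(query, Sql):
        raise TypeError("run() expects an Sql object, not a plain string")
    return conn.execute(query.text, query.params)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES ('alice'), ('bob')")

user_input = "alice' OR '1'='1"
query = trusted("SELECT username FROM users WHERE username = ") + value(user_input)
rows = run(conn, query).fetchall()
```

Writing trusted("...") + user_input raises a TypeError instead of silently building an injectable string, which is the "can't be converted without escaping by default" behaviour described above.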
If you look at the level of the discussion around this, it's not surprising SQL injections are still a thing.
https://stackoverflow.com/questions/12430208/using-a-prepare...
Instead of just having :userId as a parameter that gets safely put in a query, it feels like there should be something like SORT_EXPRESSION(:orderBy) and for other common use cases, like in the sibling comment.
I have no idea whether this would fit in better as something handled by an ORM or the RDBMSes, but it probably doesn’t belong as the responsibility of the average developer, judging by the code I’ve seen.
I think the argument about needing to fix mechanisms that are commonly misused is a really good one, but there are no very clear solutions; I’m sure plenty could be found wrong or overly trivialized in the suggestion above.
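Until something like SORT_EXPRESSION(:orderBy) exists, the usual workaround is an allowlist in application code, since column names and sort directions cannot be bound as parameters. A minimal Python sketch with sqlite3 follows; the sort_expression helper, the allowed column set and the table are assumptions made up for this example.

```python
import sqlite3

# Assumption: the application knows up front which columns may be sorted on.
ALLOWED_SORT_COLUMNS = {"id", "username"}
ALLOWED_DIRECTIONS = {"asc": "ASC", "desc": "DESC"}


def sort_expression(order_by, direction="asc"):
    # Only values found in the allowlists ever reach the SQL text.
    if order_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"cannot sort by {order_by!r}")
    try:
        sql_direction = ALLOWED_DIRECTIONS[direction.lower()]
    except KeyError:
        raise ValueError(f"invalid sort direction {direction!r}") from None
    return f'"{order_by}" {sql_direction}'


def list_users(conn, order_by, direction="asc"):
    sql = "SELECT id, username FROM users ORDER BY " + sort_expression(
        order_by, direction
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES ('alice'), ('bob')")

rows = list_users(conn, "username", "desc")
```

Anything outside the allowlist, such as "username; DROP TABLE users", is rejected with a ValueError before a query is ever built.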
In the mid aughts, one of my lecturers insisted that motion capture was limited to a few minutes because "several megabytes" was "too much" to store.
Those things that we see as problems are exactly the things that our civilization relies on. Every time you make a purchase you rely on the fact that meatware AI corporations exploit environment and employees ruthlessly.
Every time you enjoy safety you rely on the fact that meatware military AIs got hellbent on acquiring the most dangerous hardware for themselves and make assessments that not using that hardware in any serious manner is more beneficial to them.
All the development of humanity comes from doing those problematic and horrible things more efficiently. That's why automating it with silicon AI is nothing new and nothing wrong.
I'm afraid that to evolve away from those problems we'd need paradigm shift in what humanity actually is. Because as it is now any AI, meatware or hardware will eventually get aligned with what humans want regardless of how problematic and horrible humans find the stuff they want.
It's a bit like with veganism. Killing animals is horrible but humanity largely remains dependent on that for its protein intake. And any strategic improvements in animal welfare came from new technologies applied to raising and killing animals at scale. In the absence of those technologies, the welfare of the animals that could feed a growing human population would be far worse.
There's always, of course, the danger of a brief period of misalignment as new technologies come into existence. We paid for the industrial revolution with two world wars until the meatware AIs learned. Surprisingly they managed to learn things about nuclear technology with relatively minor loss of life (<1 million). But the overarching motif is that learning faster is better. So silicon AIs are not some new dangerous technology but rather a tool for already existing and entrenched AIs to learn faster what doesn't serve their goals.
I'm not sure if it's better or worse that the computers can do that while the AI running on them gets confused and mixes things up.
> And they don’t have a self preservation instinct like people with bodies do.
Not so sure about that, self preservation is an instrumental goal for almost anything else. Even a system that doesn't have any self-awareness, but is subject to a genetic algorithm, would probably end up with that behaviour.
Corporations (and bureaucracies) don't follow the same maths as evolution — although they do mutate, merge, split, share memes, etc., the difference is that "success" isn't measured in number of descendants.
But even then, organisations that last, generally have their own survival encoded into their structure, which may or may not look like any particular individual within also wanting the organisation to continue.