The better your code is, the less history you have to know to make sense of it. Every argument can have at least two interesting values (otherwise it wouldn't need to be an argument). The docs list 9 arguments here, which means there are at least 2^9 = 512 ways to invoke it. That's a lot of work to write, test, and remember. Decouple such functions: split them up and remove their dependencies on each other. String things are different from regex things, which are different from vector things. Some of the options are also mutually exclusive. Don't give users incorrect ways to use the code, i.e. a problematic invocation should be structurally nonsensical (such as passing an option that doesn't exist), not logically nonsensical (where you have to emit a warning to explain it). Put metaphorically: replacing the front door in the side of the 10th floor with a wall is better than hanging a sign that warns against its use, but either is better than neither. In an interface, the function defines what the arguments should look like, not the caller. The caller already depends on the function; inferring everything that anyone might ever want to call it with makes the function depend on its callers too, and this kind of cyclical dependency will quickly clog a system up and never provide the benefits you expect. Be very wary of conflating types; it's a design flaw that things like TRUE and 0 and "abc" are all vectors.
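To make the "structurally nonsensical" point concrete, here is a minimal sketch in Python rather than the language under discussion. The names `find_literal` and `find_regex` are hypothetical, invented for illustration. Instead of one function with a literal-vs-regex flag plus an `ignore_case` flag that only applies to one mode, each mode gets its own function carrying only the options that make sense for it.

```python
import re
from typing import List


def find_literal(text: str, needle: str) -> List[int]:
    """Start offsets of every non-overlapping occurrence of a literal substring."""
    if not needle:
        raise ValueError("needle must be a non-empty string")
    positions = []
    start = text.find(needle)
    while start != -1:
        positions.append(start)
        start = text.find(needle, start + len(needle))
    return positions


def find_regex(text: str, pattern: str, *, ignore_case: bool = False) -> List[int]:
    """Start offsets of every match of a regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    return [m.start() for m in re.finditer(pattern, text, flags)]
```

Because `find_literal` has no `ignore_case` parameter at all, calling `find_literal("a.b", ".", ignore_case=True)` fails with a `TypeError` before any matching happens. The bad combination cannot be expressed, so there is nothing to warn about.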
One type of test worth mentioning in this thread is the stress/performance/load test, which can be summed up as finding the limits beyond which a piece of software breaks. In terms of tooling, it is essential to determine precisely the scope of what you intend to stress test, from a system perspective. For instance, in the case of a "web application", the scope of stress testing can include the web server itself, so the tooling could intervene at that level. Here is a nice post about HTTP load testing.
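As a rough illustration of "finding the limit beyond which it breaks", here is a minimal Python sketch. Everything in it is an assumption made for the example: `find_breaking_point` and `fake_system` are hypothetical names, and the search assumes that once the system fails at some load it also fails at every higher load. In a real test, `run_at_load` would drive the actual system in scope (for a web application, for example, by sending that many requests to the server) and report whether it held up.

```python
from typing import Callable, Optional


def find_breaking_point(
    run_at_load: Callable[[int], bool],
    start: int = 1,
    limit: int = 1_000_000,
) -> Optional[int]:
    """Return the smallest load at which run_at_load reports failure.

    Doubles the load until the first failure, then binary-searches between
    the last passing load and the first failing one. Returns None if no
    failure was seen before the doubled load exceeded `limit`.
    """
    load = start
    last_ok = 0
    while load <= limit and run_at_load(load):
        last_ok = load
        load *= 2
    if load > limit:
        return None

    lo, hi = last_ok, load  # lo is known to pass (or is 0), hi is known to fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_at_load(mid):
            lo = mid
        else:
            hi = mid
    return hi


def fake_system(capacity: int) -> Callable[[int], bool]:
    """Stand-in for a real system under test: copes with up to `capacity` requests."""
    def run_at_load(n_requests: int) -> bool:
        return n_requests <= capacity
    return run_at_load
```

With the stand-in above, `find_breaking_point(fake_system(300))` returns 301, the first load the fake system cannot handle. Where the stand-in gets replaced is exactly the scoping decision: the same search can drive a single function, a web server, or a whole deployment.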