Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

We use spec A LOT to validate a huge config file/DSL thing for our internal ETL application.

My general feelings are:

- spec is great, you should use it, the composeability and flexibility are awesome features.

- I've never once used conformers - maybe I just don't "get it" (which if so, speaks badly of them I think since I've been heavily using spec for years), but the use cases for them seem strange to me and I feel they cause more confusion than they're worth. I wish they were separated out into more separate/optional functionality.

- It's SO MUCH more powerful than things like JSON schema, but that comes at the cost of portability - there's no way we could send our spec over the wire and have someone else in a different environment use it. But also, there's no way we could implement some of our features in a tool like JSON Schema [and have it be portable] ("Is the graph represented by this data-structure free of cycles?" "When parsed by the spark SQL query parser, is this string a syntactically valid SQL query?").

- Being able to spec the whole data input up front has saved hundreds of lines of error-checking code and allows us to give much better errors up-front to our users and devs

- Spec has a lot of really cool features for generative testing, but we rarely use them since we've implemented lots of complex specs where it's not really practical to implement a generator (i.e. "strings which are valid sql queries" or "maps of maps of maps which when loaded in a particular way meet all the other requirements and are also valid DAGs"). I feel torn about this because the test features are great, but the extreme extensibility of spec is what I love most about it. I haven't often found a scenario where I actually have a use for the generative features (either the data is so simple I don't need them, or so complex that they don't work).



At a previous job I streamed Clojure over a trusted wire and executed it remotely in some circumstances. We weren't using Spec (Prismatic Schema at the time), but it worked pretty well and we even validated our schemas in javascript via cljs-wrapped library. I don't see why this wouldn't be possible with Spec, although you're a bit limited by code that will execute everywhere.


> I've never once used conformers - maybe I just don't "get it" (which if so, speaks badly of them I think since I've been heavily using spec for years), but the use cases for them seem strange to me and I feel they cause more confusion than they're worth. I wish they were separated out into more separate/optional functionality

That's because you don't Spec your functions and macros.

A lot of people have only used Spec to validate data that enters and leaves the boundary of their application. Which is a great use of Spec, and I use Spec mostly for that as well.

But there is a whole other world where Spec was designed to validate your functions and macros as well.

That's where conformers make sense.

For macros, you can use conformers to help you with writing a macro, by using Spec to define a DSL and conform to parse it out for you. It both validates the macro DSL and makes it easier for you to parse it.

For functions, conform can be useful to assert the output is what you expect for some given input. Often times, the output might depend on what kind of input you got. Conform basically tells you the kind of input it was, so in your validation you can validate differently based on each kind conform tells you it received.


Your feeling on there being no way to transport this over the wire is puzzling to me but I admit I don't have all the details. My feeling is why not? Surely if we have wire formats for self describing binary objects that can then be serialized into an in memory structure, transporting a spec shouldn't be harder than that?


Not that there's _no_ way to transport it over the wire, it just requires the full environment (a JVM on the other end - because we have java-specific stuff like spark calls). I'd put it at an order-of-magnitude more complex than something declarative like a JSON schema which is pretty safe to execute anywhere.

I don't think this is a really big failing of spec - I don't know of ANY validation tools that don't have to compromise between power/extensibility/ease-and-safety-of-execution-somewhere-else. Maybe if you implemented some kind of uber-validator in purely functional prolog or something?


Specs in the general case require code execution, so you'd essentially need to execute that (untrusted) clojure on the other end of the wire.


Again, apologies if this sounds ignorant, but we have pretty standard practices now for sandbox execution of untrusted code. A LISP seems especially suitable for this type of task.


I don't have a clue how you would implement this. The difference in portability between spec and something like json-schema/protobuf/avro is that you can serialize the schema in these and then clojure and (say) python, go, java, C#, JavaScript applications can talk to one an other.

How would you propose to serialize clojure spec's and use them from a python app? Port the clojure compiler to python?


I second this emotion - "How would you check a spec from anything other than clj/cljs" is (IMO) the critical question here. Sure you could check out my git repo in a safe VM and execute it there, but that's a WHOLE lot more hassle than an XML or JSON schema. It's not just a language barrier thing.

There's nothing stopping spec predicates from making network calls, looping forever, etc. If I wanted to be able to call my spec from other apps I'm writing, I could package it as a library easily, but a workflow like rest call->get spec->validate data (which I've implemented many times for JSON schema for simpler things) wouldn't really be practical with spec (without at least setting some really tight restrictions on what features of spec you're allowed to use)

Again, not really a failing of spec, it's just not designed for that kind of workflow.


Yeah spec is quite a general thing, there are libraries which convert specs into database schemas, JSON schema, swagger etc

If communicating constraints to another environment is required they should help


Yeah not a fault of spec at all, which is really awesome and even inspirational IMO.


https://github.com/metosin/malli#serializable-functions

SCI (Small Clojure Interpreter) is 4500LOC and if you just want a barebones clojure interpreter to carefully evaluate '(> x y), you could probably fit it in 100 LOC clj. Ok you want to use Python, so 500 LOC py. Or port SCI, 4500LOC port is a few person-months given a reference implementation.


For the "strings which are valid sql queries" would it not be good enough to just hardcode some number of example sql queries? Say, 10-100 examples?


In fact we do very similar things - that is to say, of course spec cannot reverse-engineer my "this is valid sql" predicate to do it automatically and I end up having to hand-code a lot of generators anyways. The generation features of spec [for us, not for everyone] end up being a relatively minor value add compared to plain-ole test.check




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: