Hacker News

We used NiFi...one of the worst experiences.

It installs like an appliance and feels like you are grappling with a legacy tool weighed down by a classic view on architecture and maintenance.

We had built a data pipeline for very high-scale data. The theory of it was very much like a TIBCO type approach around data-pipelines.

Sadly the reality was also like a TIBCO type approach around data-pipelines.

One person's experience and opinion, and I am super jaded by it: a vendor crammed it down one of our directors' throats, and he subsequently crammed it down ours even after we warned how it would turn out. It ended up being a very leaky and obtuse abstraction that didn't belong in our data-pipeline once you planned for how it would be maintained longer-term.

I ultimately left that company. It had as much to do with their leadership and tooling dictation as anything else; NiFi was one of many pains. I am sure there are places using NiFi that will never outgrow the tool, so take this with a grain of salt.

Said company ultimately struggled for the very reasons those of us who left were predicting (the tooling pipeline was a mess, and they were thrashing on trying to get it right, constantly breaking things by forcing this solution, among others, into the flow. Lots of finger-pointing).

Sucks to have that "I told you so..." moment when you never wanted that outcome for them. I just couldn't be a part of their spiral anymore.



NiFi is a very powerful tool, but also a very specific one, and a self-described 'necessary evil'. It does one heck of a job at getting data from A to B though.


That was exactly how folks talked about TIBCO. I didn't find it very powerful; I found it very cumbersome. Day 2, when you need to think about maintenance, is where everything completely collapsed.

We were able to pass data around in incredibly lightweight ways leveraging Spring, sometimes just leveraging RestRepositories and transforming the object to our data representation by hand; it was never more than 100 lines of code for the entire thing. You could spit one out in an hour. The time was really in composing them and ensuring the architecture reflected the world and was still sensible/manageable.
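For illustration, a minimal plain-Java sketch of that kind of hand-rolled transform. The types and field names here are invented; the real services wrapped something like this behind Spring RestRepositories, which are omitted for brevity:

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical incoming payload, as parsed from a REST endpoint.
record OrderEvent(String id, String customer, double amountUsd) {}

// Hypothetical internal representation used by the pipeline.
record PipelineRecord(String key, Map<String, String> attributes) {}

public class OrderTransformer {
    // The whole "integration" step: a plain-Java mapping, no engine required.
    public static PipelineRecord transform(OrderEvent event) {
        return new PipelineRecord(
            "order-" + event.id(),
            Map.of(
                "customer", event.customer(),
                "amount_usd", String.format(Locale.ROOT, "%.2f", event.amountUsd())
            )
        );
    }

    public static void main(String[] args) {
        PipelineRecord rec = transform(new OrderEvent("42", "acme", 19.5));
        // prints: order-42 19.50
        System.out.println(rec.key() + " " + rec.attributes().get("amount_usd"));
    }
}
```

The point being made above is that a transform like this is a page of reviewable code, and the hard work is in the surrounding architecture, not the mapping itself.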

We ultimately faced issues with running microservices and the licensing cost of doing so. Our enterprise was sadly too big; the vendor didn't realize they needed to price their internal infrastructure competitively against legacy vendors.

You could get a WAS box for 50k and cram so much onto that server until it was bursting, and the price didn't change. On the other hand, each microservice brought a cost, and those costs added up.

The economics didn't make sense, and it was a new political battle to fight with someone who had zero understanding of marketing what they'd built. It just wasn't worth it. Lambdas, or something more ephemeral/serverless, would have been an option...but the options just weren't there for us at the time.

Enter NiFi and this "new data pipeline" and the circus began.


> It installs like an appliance and feels like you are grappling with a legacy tool weighed down by a classic view on architecture and maintenance.

This is actually a fair and well-articulated point of view. NiFi is currently an "appliance" like you said. Worse, it's a Pet and not a Cattle.

I believe there is active work in the community to address some of that pain. For example, there was a recent addition to NiFi called "stateless NiFi" which enables NiFi to better run in Kubernetes and other "cloud" architectures.

It's not there yet; it's still what would be described as a "fat" application. But I believe NiFi will eventually evolve into more of a command-and-control tool for the cloud and less like something you have to install directly onto your hardware. Hopefully we'll see the day when "NiFi-as-a-Service" exists, which would really be an improvement over the current model.


That would be a huge step.

It feels like the answer will end up being something totally different. The reality is that enterprises do well with appliances.

Selling them cattle is hard because the maintenance piece expects a certain level of hygiene, proactivity, and discipline.

An appliance sits there and when the thing breaks, you call in someone to fix the box. That relationship between a customer and vendor surprisingly makes for a good selling environment/symbiosis.

It's the Cathedral and the Bazaar in another spectrum...


Can you elaborate on what you mean by a TIBCO-like approach? I haven't used their tools, but I'd like to know more about the issues you ran into. What were examples of the leaky abstraction?


I'd like to second this request. I have encountered event buses and ETL in a number of places over my career, and I don't understand what the heck TIBCO does beyond something simple like RabbitMQ/ZeroMQ. How is this different from pub-sub (and its variants)? Any pointers to books or blog posts would be really appreciated.


TIBCO is very much about providing queuing/caching to shuttle data from one point to another.

The goal is even more so to be the interconnect for all systems across a varied enterprise at a higher level. It's all pub-sub under the hood. Think cheap butts in seats doing the same work for a "negligible hit on performance".

In the same way you can plug random devices into outlets around the house, all served by some power plant you don't know about (or even need to care about), TIBCO attempts to provide that same interface.
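The pub-sub core under that analogy can be sketched in a few lines of plain Java. This is a toy in-memory bus for illustration only, not TIBCO's actual API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// A toy in-memory pub-sub bus: producers publish to a subject,
// subscribers register callbacks, and the bus fans each message out.
class MessageBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    public void subscribe(String subject, Consumer<String> handler) {
        subscribers.computeIfAbsent(subject, s -> new ArrayList<>()).add(handler);
    }

    public void publish(String subject, String message) {
        subscribers.getOrDefault(subject, List.of())
                   .forEach(handler -> handler.accept(message));
    }
}

public class BusDemo {
    public static void main(String[] args) {
        MessageBus bus = new MessageBus();
        // Two independent systems plug into the same "outlet" (subject)
        // without knowing about each other.
        bus.subscribe("orders", msg -> System.out.println("billing saw: " + msg));
        bus.subscribe("orders", msg -> System.out.println("shipping saw: " + msg));
        bus.publish("orders", "order-42");
    }
}
```

What products like TIBCO or RabbitMQ layer on top of this shape is durability, routing, and cross-host delivery; the decoupling idea is the same.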

Data does need some restructuring, whether that's aggregations, transformations, etc., so they provide steps in the process where you can perform these operations through a drag-and-drop UI.

There is an input and an output defined in XML that you don't have to code, but which is managed and can be inspected. The engine beneath provides the lower layers (routing, bytecode, implementation), letting you just drag blocks around on a screen, "connecting things".
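A much-simplified flavor of what such an XML flow definition can look like. The element and attribute names here are invented for illustration and are not TIBCO's actual schema:

```xml
<flow name="orders-to-warehouse">
  <input  type="jms-queue" subject="orders.incoming"/>
  <step   type="transform" mapping="order-to-record.xslt"/>
  <step   type="aggregate" window="60s"/>
  <output type="jms-topic" subject="warehouse.records"/>
</flow>
```

The drag-and-drop UI reads and writes a definition of roughly this shape; the engine supplies the actual queue clients, transform execution, and routing at runtime.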

The goal is very pure: I have many people in my organization who know how data flows, and not all of them are developers. How can I enable them to connect my organization without everyone needing to be a developer?

The gap between theory and practice is always the interesting observation. What I saw happen (as was mentioned elsewhere) is that very strong developers became weak by relying on this tool (or simply left for adequately challenging work). When the world moved on to something else, so much had changed that it was almost a career change to get back into development.

They went from understanding Java 1.3/1.4 and JEE to a world of Java 11, Spring, and DI frameworks. I saw a lot of them give up or move over to product management roles. This only made the tension between on-premise infrastructure teams and public cloud teams more divisive and toxic. I don't think it's anything uncommon in other areas; it just feels like we've reached a full revolution in this particular space (and not the first revolution either).


Thanks for a clear explanation that doesn't dismiss the product as garbage. (It's in that space where techies hate it, but it must provide value since it's so expensive!)

Why do non-technical people need to understand the data flow? It seems like documentation (data dictionaries) would be preferred. Or are they useful for very non-technical people, while TIBCO data-flow understanding is useful for people who are data savvy but not tech savvy?


It is a butts in seats equation.

If you can have less expensive operators driving and mapping the world and place all the smarts in the pipes, you can drive down opex and divert cash to capex for competitive advantage.

Linux and much of the streaming software world is smart people, dumb pipes.

If you invert that, you have more automation, predictability, and control at lower cost. The risk is a lot of eggs in one basket: when the market turns, if the company you are buying from mismanages tech, or if they can't keep pace with change, you go along for their ride. Every company, big and small, falls into this technical debt. I have many opinions on why, as I am sure many do.

There is a lot baked into that comment, but it's the constant tug-of-war every CIO is trying to wrap their head around: how do we do more with less and gain an advantage?


TIBCO is garbage. They had a halo for a long time from Rendezvous/EMS, but their moneymaker was an integration suite called BusinessWorks. It was a horrifyingly complex application that forced you into ruts so that it could compile Java code. I kid you not: the developer environment for complex code was notepad.exe.

They spent a bunch of money on M&A and eventually had to go private and buy out the founder.


IMO, there is no “TIBCO-like approach” to application integration c. 2020 any more than there is say an Oracle or AWS or Google approach to databases. It’s multi-paradigm, multi-usecase and polyglot. TIBCO as a vendor supports approaches and patterns ranging from event-driven functions to data streams and stateful orchestration to stateless mediation to choreography. The “runtimes” are built on anything from Golang, Python & Node to Java, Scala & .NET.

What you're referring to sounds like the legacy version of BusinessWorks 5.x that was launched back in 2001. The current generation, BusinessWorks 6.x, provides Eclipse-based tooling just like its closest alternatives (Talend, Mule, Fuse, etc.) and deploys to 18+ PaaSes (k8s, swarm, GKE, AKS, etc.) or its own managed iPaaS aka TIBCO Cloud Integration. It's aimed at Enterprise Integration specialists at a Global 2000 or F500.

If you're an app developer at a large bank/telco/retailer/airline building integration logic or stream processing or event-driven data pipelines, you're likely to use Project Flogo (flogo.io). It's 3-clause BSD FLOSS and has commercial support and, optionally, commercial extensions available. Oh, and you're likely going to use Flogo apps with Apache Pulsar or Apache Kafka messaging. Both Pulsar and Kafka are available as commercially supported software from TIBCO (Rendezvous and EMS are our traditional proprietary messaging products). Flogo apps can deploy to TIBCO Cloud, a dozen-plus flavors of k8s, AWS Lambda, Google Cloud Run, or as a binary on an edge device.

(Disclaimer: Product at TIBCO. Used to work on BW 6.0 back when the only PaaS was good ol’ Heroku)


Curious, did you have a preferred alternative?

I get the feeling you described; NiFi has a... heavy and highly structured feel to it, but lighter alternatives are not as integrated: say, Airflow, Streamsets, AWS Glue, Kafka (a different beast), etc.

That said, NiFi is incredibly powerful and complete considering it's open source and free.


> Sadly the reality was also like a TIBCO type approach around data-pipelines.

That's exactly how it looks, thanks for confirming. Will avoid.



