Hacker News | agibsonccc's comments

I've been looking into this for the Java world. What's your use case? Deployment into existing applications?


Yeah, exactly - Python for training, Java/.NET for inference in production. I looked at approaches like gRPC and things, but my case is a bit more time-sensitive and the latency added by going over a network layer was too much.

For now I'm happy with PyTorch -> ONNX and then running the ONNX model directly. But as I said, that means I can't easily train using JAX :-(
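For what it's worth, the export step on the Python side is only a few lines. Roughly like this (a toy model and made-up names here, just to illustrate; your real module and input shapes go in their place):

    import torch

    # toy stand-in for whatever trained torch.nn.Module you actually have
    model = torch.nn.Linear(4, 2).eval()
    example = torch.randn(1, 4)

    torch.onnx.export(
        model, example, "model.onnx",
        input_names=["input"], output_names=["output"],
        dynamic_axes={"input": {0: "batch"}},  # let batch size vary at inference time
        opset_version=13,
    )

The serving side then just loads model.onnx through ONNX Runtime's Java (or .NET) API, so nothing Python-specific leaks into production.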



Ohh, I'll check that out!


Disclaimer: I am involved with the Eclipse Foundation (not as part of the core staff).

Hi, I just want to expand on this a bit...

The "eclipse foundation" is actually not just sponsored by IBM. Nor is the IDE. As for what support it provides, that's actually spot on and is under a similar idea to what apache and linux provide. There are of course differences: but for most people the vague idea that open source foundations exist and provide some similar functions as a way to host projects is enough for this discussion

A lot of Eclipse's revenue actually comes from working groups, with Jakarta EE being the biggest one. There are also IoT, automotive and many other working groups + projects under the Eclipse umbrella.


If something were to be more "neutral", what would you hope to see, exactly? Something performant is typically going to be framework/hardware specific.


Sorry, I'm not sure what you mean by "neutral". Are you talking about my suggestion to avoid DeepStream? If so:

The frameworks that work on multiple types of hardware, like TensorFlow and (probably most popular now) PyTorch, have separate backends for their different targets. Each of these backends has huge amounts of platform-specific code, and in the case of the Nvidia backend, that code is written in terms of CUDA just as DeepStream is. That's how they achieve good performance even though the top-level API is hardware-generic. The overwhelming majority of deep learning code, both the actual learning and the inference, is written in terms of these frameworks rather than Nvidia's proprietary framework. Admittedly I haven't played with Nvidia's library, but I highly doubt there's a serious performance difference - it's even possible that the open-source libraries are faster due to the greater community (/Google) effort to optimise them.

It does look like DeepStream does a lot more of the processing pipeline than just the inference. In that case it's going to be a lot trickier to get the whole pipeline on the GPU using TensorFlow or PyTorch. At the end of the day, if only DeepStream does what you need, I'm not saying you necessarily shouldn't use it - just that you should ideally attempt to avoid it if reasonably possible.


Hi Fareesh, I'd love to hear more about your use case. Email's in my profile.


Hi, could you describe your use case a bit? Just an alarm trigger for deer in the backyard?


Yes, I'd point a camera at my precious vegetables and if a deer walks into the video feed, something that scares it off is triggered so it runs off before eating the whole garden.


Could you elaborate on some of the problems you had overall?


I've unsuccessfully dabbled in gstreamer in the past. I was doing a project this weekend, and the comments on this thread motivated me to give it another shot. After a couple of hours (2-4ish?), I was able to get video off the Pi to my desktop (on the same LAN), but the performance was pretty bad. I haven't optimized much yet, but let me summarize the key issues I experienced with gstreamer these last few hours:

1) Very little documentation; poorly explained pipelines. I tried to read what docs I could find but things quickly devolved into trying out random gstreamer pipelines posted in comments. People don't explain why they use one particular element over another. So it felt like whack-a-mole.

2) Installing gstreamer on the Pi was a breeze. I wanted to pull video off the connected camera and send it to VLC on my desktop. Sounded like something that would work out-of-the-box? Nope. Kept seeing lots of Stack Overflow comments of people stabbing in the dark, getting errors (or having the thing just sit there and not work) with very little feedback on what was wrong.

3) I have very little indication of what is hardware-accelerated and what is software-accelerated in my pipeline. I have no idea where latency is coming from in my pipeline.

Overall, my modern expectation for software frameworks is "batteries included". It is totally reasonable for sophisticated software tools to be complex, but gstreamer is just not designed that way. While I got it to work, I see massive latency (likely because my pipeline is inefficient) and degraded quality (no idea why).
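For concreteness, the kind of pipeline I was cargo-culting from those comments looked roughly like this; treat it as a sketch rather than a known-good setup, since the element choices and the receiver address are just placeholders (this is the Python wrapper around the same pipeline string you'd pass to gst-launch-1.0):

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst, GLib

    Gst.init(None)

    # Sender side on the Pi: grab the camera, H.264-encode, ship RTP over UDP.
    # v4l2src may need swapping out depending on the camera stack in use.
    pipeline = Gst.parse_launch(
        "v4l2src device=/dev/video0 ! videoconvert "
        "! x264enc tune=zerolatency bitrate=1000 speed-preset=ultrafast "
        "! rtph264pay config-interval=1 pt=96 "
        "! udpsink host=192.168.1.50 port=5000"  # placeholder desktop IP
    )
    pipeline.set_state(Gst.State.PLAYING)
    GLib.MainLoop().run()  # keep the pipeline alive

Even with something this small, it's hard to tell which of those elements are the latency culprits, which is basically my point 3 above.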


A few questions:

1. Did you build this for your own use cases? Interesting side project?

2. How do you feel about base64 being a requirement on the endpoints? Isn't gRPC the wrong medium for this? Also, what do you see as the main limitations right now? The models?


1. I built it to integrate with Home Assistant and security systems. I was trying to use TensorFlow on a Raspberry Pi and the dependencies were a nightmare. TensorFlow in general is a nightmare to compile and run IMO. I got to thinking, what if I could put all the deps inside of a Docker container? What if I could run it remotely? It was born out of that.

2. As for base64, I'm not sure of a better way to support sending raw image data over JSON (in REST mode). In some ways I think gRPC is a better medium than JSON (it supports either), as gRPC supports sending the raw bytes (rough client-side sketch at the end of this comment). What leads you to believe gRPC isn't the right transport? Plus you can do it in a stream format if you want to do a lot of video.

The only real limitation I can think of is that TensorFlow supports a myriad of CPU optimizations, so providing a single container image that has all the right options is basically impossible. I created one that has what I think are some of the better options (AVX, SSE4.x) and then an image that should basically run on any 64-bit Intel-compatible CPU. To get optimized options you need to build the Docker container yourself, which can take the better part of a day on slower CPUs.

With that said, I also provide ARM32 and ARM64 containers that actually run semi-okay on Raspberry Pis and other ARM SBCs. I can run the Inception model on a Pi 4 on a 1080p image in about 5 seconds, which is pretty good IMO.
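To make the base64 point above concrete, the REST path looks roughly like this from a client (endpoint path and field names are made up for the example, not the project's actual API):

    import base64
    import json
    import urllib.request

    # Read raw JPEG bytes and base64-encode them so they can travel inside JSON.
    with open("frame.jpg", "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")

    payload = json.dumps({"image": encoded}).encode("utf-8")

    # Hypothetical REST endpoint. The gRPC path would instead put the raw bytes
    # straight into a `bytes` field of the protobuf message, avoiding the ~33%
    # size overhead that base64 adds.
    req = urllib.request.Request(
        "http://localhost:8080/v1/detect",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))

That size and copy overhead is the main reason I see gRPC as the better fit when you're pushing a lot of frames.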


How did you set it up on the software side though? How flexible/customizable was it?


It's still pretty early days for Ray. That being said, Spark never really got the hang of doing machine learning properly. It "works", but not for the newer workloads which Ray is trying to support.

It's good someone is building a company around it. I could see them building services on top of it and building a SaaS like Databricks did with Spark.

I'll be curious to see how Ray matures.


I agree that ML on Spark was only a limited hit (iterative jobs would actually be feasible versus Hadoop), but I still have yet to find a better ETL and SQL tool, and that's a big part of most ML projects.

I'm worried about Ray as a SaaS company because so far it looks to me like they're riding reinforcement learning hype. They'd need to really penetrate the users of Horovod and TensorFlow Distributed to get beyond a beachhead. And what if TPUs and Cerebras become more common? Because then the market for multi-machine workloads becomes smaller (definitely not zero though).


Your concerns are right on point. I agree that Spark is a great SQL/ETL tool. My thinking was on the "math execution" part. Ray is able to do a bit more there. I do feel like there is a bit of hype riding going on here as well.

One interesting thing that could happen is that the hardware gets better, and then these distributed schedulers might not be able to keep up with all the different options on the market.

There is also the tension between the hardware vendors, who want to give away things that only run on their chips, and the software makers, who want things to run on every chip. It seems like there will be a lot of competition among the various infra players in the next few years now that Nvidia is starting to have real competition (even if it's not big yet).


Just to qualify that "math execution" part, the beauty of Ray is that you get threadpool-like features to speed up arbitrary Python code. So not just parallelism, but state/variable sharing for relatively small data. This is great for some optimizers and definitely RL (where your "math" is some really complicated simulation / loss logic), but Ray wouldn't make much sense for BLAS stuff. Am I missing something here?
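A minimal sketch of the pattern I mean, using Ray's standard remote tasks and the shared object store (toy function, nothing RL-specific):

    import ray

    ray.init()

    # Put a "relatively small" shared object in the object store once,
    # so it isn't re-serialized with every task submission.
    config_ref = ray.put({"threshold": 0.5})

    @ray.remote
    def score(x, config):
        # stand-in for the complicated simulation / loss logic
        return x * x if x > config["threshold"] else 0.0

    # Fan out like a thread pool, then gather results.
    futures = [score.remote(x / 10, config_ref) for x in range(10)]
    print(ray.get(futures))

Ray resolves the object reference before the task body runs, so `config` shows up as a plain dict inside `score`.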

Ray shows expertise in multi-machine execution that's lacking in stuff like JAX, TensorFlow, and PyTorch. Horovod nailed down a lot of the performance issues for SGD in particular, but it is missing the sort of rapid deployment / distribution stuff in Ray. If only they could all work together ...


Graal itself still has limitations with many libraries, the biggest one being any library that heavily uses reflection. I'd look at some of the work Red Hat is doing with some of its Java libraries and Quarkus: https://developers.redhat.com/blog/2019/03/07/quarkus-next-g...


IIRC you also end up relying on a Graal-provided version of Python that may or may not be kept up-to-date by a project for which it's just a secondary target. Been there with IronPython and Jython; it's no fun.

