Apache Arrow Flight: A Framework for Fast Data Transport (apache.org)
128 points by stablemap on Oct 15, 2019 | 21 comments


It's interesting how much faster your laptop SSD is compared to these high-end, performance-oriented systems. Keep in mind that the localhost/TLS-disabled number is an upper bound. (Not singling out Arrow by any means; most others are slower.)

I wonder which came first: the petering out of wired network hardware performance improvements, or the software bottlenecks that become obvious if we try to use today's faster networks. 100 Mb Ethernet came in 1995, GigE in 1999, 10 GigE in 2002, each gaining adoption within a few years. On that track we should have had 100 GigE in 2006 and seen it in servers in 2008 and workstations in 2010, and switches/routers should have seen terabit Ethernet in 2010. Today's servers (X) seem to be at about 25 GbE, and with multicore that's just 1-2 gigabits per core.

(X) according to https://www.supermicro.com/products/system/1U/


Network is now outpacing single core performance.

The same 25 Gbps claimed by the article can be achieved with a single-threaded ZeroMQ socket. That thread will be CPU bound. To break 25 Gbps, multiple I/O threads need to be engaged.

There are already network links faster than 100 Gbps, while single-core speed has stagnated for many years. Multi-threaded or multi-streamed solutions (like in the article) are needed to saturate modern network links.
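
For illustration, engaging multiple I/O threads with ZeroMQ looks roughly like this (a sketch only, assuming a recent cppzmq (>= 4.3); ZeroMQ assigns each connection to one of the context's I/O threads, so several sockets/connections are needed to keep more than one of them busy; endpoints and sizes here are made up):

    #include <zmq.hpp>
    #include <string>
    #include <vector>

    int main() {
      // Ask ZeroMQ for 4 I/O threads instead of the default 1.
      zmq::context_t ctx(4);

      // One PUSH socket per peer; each connection is handled by one
      // I/O thread, so several connections are needed to spread load.
      std::vector<zmq::socket_t> sockets;
      for (int i = 0; i < 4; ++i) {
        sockets.emplace_back(ctx, zmq::socket_type::push);
        sockets[i].connect("tcp://127.0.0.1:" + std::to_string(5555 + i));
      }

      std::string payload(1 << 20, 'x');  // 1 MiB dummy message
      for (auto& s : sockets) {
        auto sent = s.send(zmq::buffer(payload), zmq::send_flags::none);
        (void)sent;
      }
      return 0;
    }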


Yep, you can't do anything fast in software without parallelism in the modern world, as single-thread performance improvements have stalled even harder than networking speed improvements. I guess my observation might be a symptom of parts of the software world staying behind at the single-core performance level, with only specialized applications bothering to do the major software surgery that 10x or 100x parallelism requires.


More people should try building high-performance services with non-traditional protobuf implementations. The fact that every language has a generated parser in no way precludes you from parsing them yourself. Hand-rolled serialization of your outbound messages can also be really fast, and the C++ gRPC stack will just accept preformatted messages and put them on the wire. Finally, the existence of gRPC itself should not make you feel constrained against implementing the entire protocol yourself. It’s just HTTP/2 with conventional headers.
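
As a concrete (entirely hypothetical) sketch of what parsing them yourself can look like: suppose field 1 of a message is a length-delimited bytes payload; you can walk the wire format with CodedInputStream and take a pointer straight into the input buffer instead of going through the generated parser. The field number and message layout below are made up for illustration:

    #include <google/protobuf/io/coded_stream.h>
    #include <google/protobuf/wire_format_lite.h>
    #include <cstdint>

    using google::protobuf::io::CodedInputStream;
    using google::protobuf::internal::WireFormatLite;  // internal, but handy for tags

    // Returns a zero-copy view of field 1 of a hypothetical
    // `message { bytes body = 1; }`, or nullptr on failure.
    const void* FindBodyField(const uint8_t* buf, int size, int* out_len) {
      CodedInputStream in(buf, size);
      uint32_t tag;
      while ((tag = in.ReadTag()) != 0) {
        if (WireFormatLite::GetTagFieldNumber(tag) == 1 &&
            WireFormatLite::GetTagWireType(tag) ==
                WireFormatLite::WIRETYPE_LENGTH_DELIMITED) {
          uint32_t len;
          if (!in.ReadVarint32(&len)) return nullptr;
          const void* data;
          int avail;
          // Peek at the underlying flat buffer; no copy is made.
          if (!in.GetDirectBufferPointer(&data, &avail) ||
              avail < static_cast<int>(len)) {
            return nullptr;  // payload truncated or not contiguous
          }
          *out_len = static_cast<int>(len);
          return data;
        }
        // Not the field we want: skip it and keep scanning.
        if (!WireFormatLite::SkipField(&in, tag)) return nullptr;
      }
      return nullptr;
    }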


To be clear for anyone reading, we're parsing and generating the data-related protobufs ourselves, and retaining ownership of the memory returned by gRPC to obtain zero copy.

The C++ details are found in

https://github.com/apache/arrow/blob/master/cpp/src/arrow/fl...
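
Roughly, the shape of the zero-copy trick at the gRPC API level (a sketch, not the actual code behind that link) is to dump the received grpc::ByteBuffer into refcounted grpc::Slice objects and keep those alive for as long as the payload is needed:

    #include <grpcpp/support/byte_buffer.h>
    #include <grpcpp/support/slice.h>
    #include <grpcpp/support/status.h>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    void ConsumePayload(const uint8_t* data, size_t len);  // hypothetical consumer

    // Dump the ByteBuffer into refcounted slices and keep them. As long
    // as `slices` is alive, the pointers returned by begin() stay valid,
    // so the payload can be used without copying it out of gRPC.
    bool TakeSlices(const grpc::ByteBuffer& buffer,
                    std::vector<grpc::Slice>* slices) {
      grpc::Status s = buffer.Dump(slices);
      if (!s.ok()) return false;
      for (const grpc::Slice& slice : *slices) {
        ConsumePayload(slice.begin(), slice.size());
      }
      return true;
    }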


Have you considered/tried using the new ParseContext instead of CodedInputStream? It is performance-oriented.

Edit: Apparently it's also the default in protobuf 3.10


I wasn't aware of it but will take a look. Thanks!


A bit off topic, but since this is implemented using gRPC, I’d like to ask: what is RPC, and how does one make a (g)RPC call?

My understanding is that it’s a binary alternative to JSON/REST APIs and that all Google Cloud Platform services use it. However, since I have not managed to figure out how to do a single interaction with RPC against GCP (or any other service), I am wondering if my understanding is completely wrong here.


RPC is a general term and stands for remote procedure call. You make what might look like a normal function call, but the actual function is executed on another host.

gRPC is one implementation of RPC, where HTTP/2 is used as the transport layer and protocol buffers are used for data serialization. You typically use it via the gRPC framework: generate code for a specific API, then use the generated code and the client library to perform the call. There may, however, also be other ways, e.g. proxies to HTTP systems and server-introspection mechanisms that allow performing calls without requiring the API specification.
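
To make that concrete, a minimal synchronous call with the generated C++ code looks roughly like this (sketch only; "Echo" is a hypothetical service defined in a .proto file, from which protoc generates echo.grpc.pb.h and the stub used below):

    #include <grpcpp/grpcpp.h>
    #include "echo.grpc.pb.h"  // generated from the hypothetical echo.proto

    int main() {
      // Open a channel to the server (no TLS in this sketch).
      auto channel = grpc::CreateChannel("localhost:50051",
                                         grpc::InsecureChannelCredentials());
      auto stub = Echo::NewStub(channel);

      EchoRequest request;
      request.set_message("hello");
      EchoResponse response;

      grpc::ClientContext context;
      // The generated stub serializes the protobuf, sends it over HTTP/2,
      // and deserializes the reply -- this is the "remote" function call.
      grpc::Status status = stub->Say(&context, request, &response);
      return status.ok() ? 0 : 1;
    }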



Are there any thoughts about where compression fits into this model?

I know networks are getting very fast, but with this size of data I wonder if there are realizable gains left with modern algorithms like Snappy.


Columnar formats generally compress very well due to the similarity between values in the same column.
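
As a toy illustration of both points (a sketch; actual gains depend entirely on the data), you can compress a run of similar, column-like values with Snappy and compare sizes:

    #include <snappy.h>
    #include <cstdio>
    #include <string>

    int main() {
      // Fake "column": many near-identical values, like a low-cardinality ID column.
      std::string column;
      for (int i = 0; i < 1000000; ++i) {
        column.append("user_00042,");
      }

      std::string compressed;
      snappy::Compress(column.data(), column.size(), &compressed);
      std::printf("raw: %zu bytes, snappy: %zu bytes\n",
                  column.size(), compressed.size());

      // Round-trip to show the decompression step on the receiving side.
      std::string roundtrip;
      snappy::Uncompress(compressed.data(), compressed.size(), &roundtrip);
      return roundtrip == column ? 0 : 1;
    }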


We are struggling with reliability when using mounting solutions for big data in S3. Would this help?


Define big data? Have you tried https://github.com/kahing/goofys/ ?

Disclaimer: I am the author


Yes. 1PB. Although I don't remember the specifics about reliability; something about having to remount the entire fs if it wasn't 100% there.


What do you mean by not 100% there?


Probably the fact that S3 is eventually consistent and IO is not immediately visible to other clients?

I'd suggest that OP use something like https://aws.amazon.com/fsx/lustre/ if they really need to mount S3 as a filesystem (do read the documentation regarding distributed consistency, though). Beyond that, AWS solutions in general (Athena, Presto on EMR, Spark on EMR) tend to use specialized S3 committers such as https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spar... and are quite good at minimizing this problem.

In any case, read the S3 docs for this behavior.


Agreed that using native S3 solutions is better than using something that emulates POSIX on S3. Unfortunately the former isn't always possible.


Depends how you would use it, I guess. And it seems quite bound to certain data formats.

After reading the headline I initially thought this was about data as in any kind of bytes to replicate, or something like that. But it is something else, mainly judging from "is focused on optimized transport of the Arrow columnar format (i.e. “Arrow record batches”) over gRPC".


>> mounting solutions

Could you elaborate?




