LVC20-101 Make BigData fly on Arm64 - Apache Arrow

Session Abstract

The big data world has many data formats and representations, such as Parquet files read with Python (pandas), Spark DataFrames, JSON, Avro, and CSV.

As data moves between projects, about 70-80% of computation can be wasted on data conversion and serialization/deserialization.

Apache Arrow addresses these issues and facilitates communication between many components with its high-speed in-memory representation for flat and hierarchical data. It can help achieve a 10-100x speedup on in-memory analytics workloads.
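
To make the idea of a shared in-memory representation concrete, here is a minimal sketch using only the Python standard library. It is not Arrow's API: the `rows`, `ids`, and `scores` names and the sample values are hypothetical, and `array` plus `memoryview` merely stand in for a columnar buffer that two components can read without converting it.

```python
import json
from array import array

# Hypothetical row-oriented records, as they might arrive from JSON or CSV.
rows = [{"id": 1, "score": 0.5}, {"id": 2, "score": 1.5}, {"id": 3, "score": 2.5}]

# Columnar layout: one contiguous, typed buffer per field.
ids = array("q", (r["id"] for r in rows))        # 64-bit signed integers
scores = array("d", (r["score"] for r in rows))  # 64-bit floats

# Path 1: hand the data to another component by serializing and parsing.
# Every value is converted to text and back, and a full copy is made.
copied = json.loads(json.dumps(rows))

# Path 2: share the column buffer directly. A memoryview is a zero-copy
# window onto the same bytes, so no conversion or copy takes place.
shared_scores = memoryview(scores)

# A write through the original array is visible through the view,
# which shows that both names refer to the same memory.
scores[0] = 9.0
total = sum(shared_scores)
```

The serialized copy still holds the old value after the write, while the shared view sees the new one. That difference is the cost the abstract refers to: the first path pays for conversion on every hand-off, the second does not.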

In collaboration with Linaro LDCG, we validated Apache Arrow on Arm64 and delivered Arm-specific optimizations for Arrow.
This session covers an overview of Apache Arrow, a brief introduction to Arrow optimization with the Arm crypto and Neon extensions, and the status of the patches submitted to the community. You will see benchmark results and learn how to take advantage of ARMv8 features to make your data fly.

Session Speakers


Senior software engineer (Arm)

Yuqi Gu currently works at Arm and is a committer on the Apache Bigtop project. He is also an active contributor to Apache Arrow, MariaDB and RocksDB, focusing mainly on performance optimization on Arm64.
