Apache Spark™ unifies batch and real-time streaming data processing in your preferred language: Python, SQL, Scala, Java, or R. It runs distributed ANSI SQL queries for dashboards and ad-hoc reporting faster than most data warehouses. You can perform Exploratory Data Analysis (EDA) on petabyte-scale data without resorting to downsampling. You can also train machine learning algorithms on a laptop, then use the same code to scale to fault-tolerant clusters of thousands of machines.
The Kubernetes Operator for Apache Spark, developed by Google Cloud, lets you manage Spark applications the same way as other Kubernetes workloads. You can now deploy the operator on your Managed Kubernetes clusters in Nebius infrastructure. It is available for free in the new Kubernetes Apps category, along with other handy tools.
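To illustrate what managing a Spark application like any other Kubernetes workload can look like, here is a minimal sketch that builds a `SparkApplication` custom resource and prints it as JSON. The helper name `build_spark_application`, the application name, the container image, the script path, the Spark version, the `spark` service account, and the resource sizes are all illustrative assumptions, not values taken from this product; check the field names against the operator version you actually deploy.

```python
import json


def build_spark_application(name, image, main_file, executors=2):
    """Return a SparkApplication manifest as a plain dict (hypothetical helper)."""
    return {
        # API group and kind used by the operator's v1beta2 custom resource.
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": "default"},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,                  # assumed image, replace with your own
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",         # assumed version
            "driver": {
                "cores": 1,
                "memory": "512m",
                "serviceAccount": "spark",   # assumed service account name
            },
            "executor": {
                "instances": executors,
                "cores": 1,
                "memory": "512m",
            },
        },
    }


# Example values below are placeholders for illustration only.
manifest = build_spark_application(
    name="example-pyspark-job",
    image="example.registry/spark-py:latest",
    main_file="local:///opt/spark/app/job.py",
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON manifests as well as YAML, the printed output could be piped to `kubectl apply -f -` on a cluster where the operator is installed, after which the application can be inspected with the usual `kubectl get` and `kubectl describe` commands.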