iomrslmy – Cloud Content

iomrslmy
February 17, 2024

Apache Spark Partitions – AWS S3

Reading data from AWS S3

When Apache Spark reads data from Amazon S3 (Simple Storage Service), the process of creating partitions is different from reading data from HDFS. In the case of S3, Spark does not directly align partitions with the concept of HDFS blocks, as there is no block-based storage system like HDFS in
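To make the idea of partitioning without HDFS blocks concrete, here is a minimal sketch of the split-size arithmetic that Spark's DataFrame file sources apply, which is driven by the settings `spark.sql.files.maxPartitionBytes` and `spark.sql.files.openCostInBytes` rather than by storage blocks. The function names and sample inputs are hypothetical, the 128 MiB and 4 MiB values are assumed Spark defaults, and the real partition count can differ because Spark also packs small chunks together and the S3 connector has its own settings.

```python
MIB = 1024 * 1024


def max_split_bytes(file_sizes, default_parallelism,
                    max_partition_bytes=128 * MIB,   # assumed default of spark.sql.files.maxPartitionBytes
                    open_cost_in_bytes=4 * MIB):     # assumed default of spark.sql.files.openCostInBytes
    """Estimate the largest chunk Spark will cut a file into (illustrative only)."""
    # Every file is charged an extra "open cost" so that many tiny files
    # do not end up in one oversized partition.
    total_bytes = sum(size + open_cost_in_bytes for size in file_sizes)
    bytes_per_core = total_bytes // default_parallelism
    return min(max_partition_bytes, max(open_cost_in_bytes, bytes_per_core))


def estimated_chunks(file_sizes, default_parallelism):
    """Count the chunks produced before Spark packs them into partitions."""
    split = max_split_bytes(file_sizes, default_parallelism)
    # Ceiling division per file; assumes every file is splittable.
    return sum((size + split - 1) // split for size in file_sizes)


if __name__ == "__main__":
    one_gib_object = [1024 * MIB]
    print(max_split_bytes(one_gib_object, default_parallelism=8) // MIB, "MiB per chunk")
    print(estimated_chunks(one_gib_object, default_parallelism=8), "chunks")
```

The chunk count is an upper bound on the number of read partitions under these assumptions, since several small chunks may share a partition.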

iomrslmy
February 17, 2024

Apache Spark Partitions – HDFS

Reading data from HDFS

When Apache Spark reads data from Hadoop Distributed File System (HDFS), the process of creating partitions is influenced by several factors. Here’s an overview of how Spark creates partitions when reading data from HDFS.

HDFS Blocks

The primary storage unit in HDFS is a block. By default, these blocks are commonly
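As a rough illustration of how block size relates to partition count, the sketch below counts the HDFS blocks a file occupies. It assumes a 128 MiB block size (the default `dfs.blocksize` in recent Hadoop releases; a cluster may be configured differently) and treats one block as roughly one input partition, which is only an approximation because Hadoop's split logic and Spark's own settings can change the final number. The function name is hypothetical.

```python
MIB = 1024 * 1024


def hdfs_block_count(file_size_bytes, block_size_bytes=128 * MIB):
    """Return how many HDFS blocks a file of the given size occupies."""
    if file_size_bytes <= 0:
        return 0
    # Ceiling division: the last block may be only partially filled.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


if __name__ == "__main__":
    for size_mib in (100, 300, 1024):
        blocks = hdfs_block_count(size_mib * MIB)
        print(f"{size_mib} MiB file -> {blocks} block(s)")
```

Under these assumptions a 300 MiB file spans three blocks (two full blocks and one partial), so a block-aligned read would start from about three partitions.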

iomrslmy
January 7, 2024

Execute Glue Job Locally in IntelliJ Without an AWS Account

The blog post guides users on running an AWS Glue job locally within the IntelliJ environment without the need for an AWS account. It outlines the steps to execute Glue jobs, offering a practical solution for developers to test and debug their Glue scripts locally before deploying them to the AWS cloud. The approach enhances

iomrslmy
December 16, 2023

AWS Glue Job vs. EMR Spark Job Cost Comparison

Deciding on the most cost-effective option for your Spark jobs can be tricky, as AWS Glue and EMR have distinct pricing models and capabilities. Let’s dive into a quick comparison to help you choose.

Cost Comparison Considerations:

Recommendation: It is essential to evaluate your specific use case, workload characteristics, and preferences to determine the most

Cloud Content

Author: iomrslmy
