Learn how to clean data with Apache Spark in Python.
Working with large amounts of data is challenging, especially at the scale of millions or billions of rows. If you have been handed data processing code that was written on a laptop against relatively clean data, chances are you have been tasked with moving a basic data process from prototype to production. You may also have encountered real-world datasets with missing fields, unusual formatting, and significantly more data. Even if you are new to this field, this course helps you gain the skills needed to prepare data processes using Python with Apache Spark. You will learn the terminology, methods, and best practices for building a high-performing, maintainable, and comprehensible data processing platform.
by DataCamp
Learn how to build and test data engineering pipelines in Python using PySpark and Apache Airflow.
by DataCamp
Learn to clean data as quickly and accurately as possible to help your business move from raw data t...
by DataCamp
Learn to tame your raw, messy data stored in a PostgreSQL database to extract accurate insights.
by DataCamp
Develop the skills you need to clean raw data and transform it into accurate insights.
by DataCamp
Learn how to identify, analyze, remove and impute missing data in Python.
by DataCamp
Learn how to ensure clean data entry and build dynamic dashboards to display your marketing data.
by DataCamp
Learn how to build your own SQL reports and dashboards, plus hone your data exploration, cleaning, a...
by DataCamp
Learn how to import and clean data, calculate statistics, and create visualizations with pandas.
by DataCamp
Learn how to run big data analysis using Spark and the sparklyr package in R, and explore Spark MLlib...