Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression, linear regression, ensembles, and pipelines.
This course, Learn to Use Apache Spark for Machine Learning, teaches you how to use Apache Spark for machine learning tasks. Spark is widely used for working with big data, and it handles the distribution of compute tasks across a cluster for you, so you can focus on the analysis rather than on the technical details of running it at scale.
You will learn how to import data into Spark and then work through three core parts of Spark machine learning: linear regression, classifiers such as logistic regression, and pipelines. Together these equip you to tackle a wide range of predictive modeling tasks.
You will also build and test decision trees, which are a good starting point for exploring machine learning models. Using the recursive partitioning algorithm, you will learn how to find the most informative split in your data and divide the data into two groups. The same process is then repeated on each group at further nodes, producing a decision tree that can make predictions on new data.
The course also covers logistic and linear regression in PySpark. Logistic regression models are a mainstay of classification tasks, and you will learn how to build and evaluate them. You will then explore linear regression models and learn how to refine your predictors so that the model focuses on the most relevant ones. By the end of the course, you should feel confident applying what you have learned.
Throughout the course, you will engage in hands-on tasks and work with practice data sets, allowing you to gain practical experience and reinforce your understanding of the concepts covered.
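To make the recursive partitioning idea in the description concrete, here is a minimal sketch in plain Python. It is not the course's code and not Spark's implementation: the function names (gini, best_split, build_tree, predict) are hypothetical, the sample data is made up for illustration, and Gini impurity is assumed as the measure of how informative a split is, since the description does not name one.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the two-way
    split with the lowest weighted Gini impurity, or None if no split exists."""
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[2]:
                best = (feature, threshold, impurity)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Split the data in two at the best split, then repeat on each half."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or gini(labels) == 0.0:
        return {"predict": majority}
    split = best_split(rows, labels)
    if split is None:  # all rows are identical, so nothing can be separated
        return {"predict": majority}
    feature, threshold, _ = split
    left = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while "predict" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["predict"]


# Made-up sample data: only the first feature separates the two classes.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 10.0], [4.0, 20.0]]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
print(best_split(rows, labels))
print(predict(tree, [1.5, 99.0]), predict(tree, [3.9, 0.0]))
```

In PySpark you would normally rely on the library's own decision tree classifier rather than code like this; the sketch only shows the split-then-repeat logic on a small in-memory list.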
by DataCamp
Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression...
by DataCamp
Learn the gritty details that data scientists are spending 70-80% of their time on; data wrangling a...
by DataCamp
Learn the fundamentals of working with big data with PySpark.
by DataCamp
Learn how to build and test data engineering pipelines in Python using PySpark and Apache Airflow.
by DataCamp
This course teaches the big ideas in machine learning like how to build and evaluate predictive mode...
by DataCamp
Learn how to build and tune predictive models and evaluate how well they'll perform on unseen data.
by DataCamp
Learn how to use tree-based models and ensembles to make classification and regression predictions w...
by DataCamp
Grow your machine learning skills with scikit-learn in Python. Use real-world datasets in this inter...
by DataCamp
An introduction to machine learning with no coding involved.
by DataCamp
Learn the power of deep learning in PyTorch. Build your first neural network, adjust hyperparameters...