Feature Engineering and Feature Selection with Python: A Practical Guide For Feature Crafting
Title: Feature Engineering and Feature Selection with Python: A Practical Guide For Feature Crafting
Author: Charfaoui Younes
Publisher: Independently published
Format: pdf, epub
Size: 11.9 MB
Feature engineering is the process of using domain knowledge about the data to create features or variables that make Machine Learning (ML) algorithms work more efficiently. It’s a fundamental task for improving Machine Learning model performance and prediction accuracy. However, feature engineering can be very time-consuming, because the variables in a dataset take considerable effort to process before they can be used in a model.
Feature engineering includes several processes, like:
• Filling missing values within a variable.
• Encoding categorical variables into numbers.
• Variable transformation.
• Creating or extracting new features from the ones available in your dataset.
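The four processes listed above can be sketched on a toy dataset. This is only an illustration using the Python standard library, not the book's own code: the records, the column names (`age`, `city`, `income`, `household`), and the helper `engineer` are all hypothetical, and median imputation, one-hot encoding, and a log transform are just one common choice for each step.

```python
import math
from statistics import median

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 25, "city": "Paris", "income": 30000.0, "household": 1},
    {"age": None, "city": "Lyon", "income": 45000.0, "household": 3},
    {"age": 40, "city": "Paris", "income": 120000.0, "household": 4},
    {"age": 31, "city": "Nice", "income": 52000.0, "household": 2},
]


def engineer(rows):
    """Apply one simple technique per process to each record."""
    # 1. Filling missing values: use the median of the known ages.
    age_fill = median(r["age"] for r in rows if r["age"] is not None)

    # 2. Encoding categorical variables: one 0/1 column per city (one-hot).
    cities = sorted({r["city"] for r in rows})

    engineered = []
    for r in rows:
        feat = {"age": r["age"] if r["age"] is not None else age_fill}
        for city in cities:
            feat[f"city_{city}"] = 1 if r["city"] == city else 0

        # 3. Variable transformation: log(1 + x) compresses the skewed income.
        feat["log_income"] = math.log1p(r["income"])

        # 4. Creating a new feature from existing ones.
        feat["income_per_person"] = r["income"] / r["household"]
        engineered.append(feat)
    return engineered


features = engineer(rows)
for feat in features:
    print(feat)
```

In practice a data science library would typically handle these steps; the plain-Python version is used here only to keep each step visible.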
Throughout this book, you’ll learn many techniques for modifying features using all these processes.
With recent developments in Big Data, we’ve been given more access to data in general and to high-dimensional data in particular. Consequently, the performance of Machine Learning models has improved by a large margin. On the other hand, a significant number of features, often collected or generated by different sensors and methods, can harm model accuracy and need careful consideration. These features can also demand a lot of computational resources to build and maintain the model.
For that reason, we need handy processes in the machine learning pipeline that let us build great models even with these kinds of features. This book takes a hands-on approach to feature engineering and feature selection techniques for machine learning with Python, and explores pretty much all you’ll need to know about both.
Specifically, we’ll learn how to modify dataset variables to extract meaningful information and capture as much insight as possible, and how to filter out unneeded features, leaving datasets and their variables ready to be used in machine learning algorithms.
Newcomers to the field of machine learning often confuse feature selection with feature engineering.
• Feature engineering allows us to create new features from the ones we already have to help the machine learning model make more effective and accurate predictions.
• Feature selection, on the other hand, allows us to select features from the feature pool (including any newly-engineered ones) that will help machine learning models make predictions on target variables more efficiently.
In a typical ML pipeline, we perform feature selection after completing feature engineering.
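To make the distinction concrete, here is a minimal sketch of a selection step that runs on an already-engineered feature pool. It is an assumption-laden illustration rather than the book's method: the helper names `pearson` and `select_features`, the toy columns, and the choice of a simple filter (drop constant features, then rank by absolute Pearson correlation with the target) are all hypothetical.

```python
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def select_features(columns, target, k):
    """Keep the k non-constant features most correlated with the target."""
    # Constant columns carry no information, so drop them first.
    varying = {name: vals for name, vals in columns.items() if pvariance(vals) > 0}
    # Score each remaining feature by |correlation| with the target.
    scores = {name: abs(pearson(vals, target)) for name, vals in varying.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical feature pool (original and engineered columns) and target.
columns = {
    "size": [50, 60, 80, 100, 120],
    "rooms": [2, 2, 3, 4, 4],
    "constant_flag": [1, 1, 1, 1, 1],
    "noise": [3, 1, 4, 1, 5],
}
target = [150, 180, 240, 300, 360]

selected = select_features(columns, target, k=2)
print(selected)
```

A correlation filter like this only captures linear, one-feature-at-a-time relationships, so it is best read as a starting point rather than a complete selection strategy.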
This book is divided into two parts: Feature Engineering and Feature Selection. We will start with feature engineering, then move on to feature selection.
The reader of this book should have some familiarity with Machine Learning. We’ll be using Python as a programming language, as well as data science libraries like:
You’ll need to prepare your workspace. I suggest using Anaconda since it’s pre-equipped with Python and all of the aforementioned libraries.