python
-
Exploratory Data Analysis for Machine Learning | Part-1
Exploratory Data Analysis (EDA) is a critical first step in any machine learning project. It involves examining and visualizing datasets to uncover patterns, detect anomalies, and gain insights that inform data preprocessing and model selection. By using statistical summaries, visualizations like histograms and scatter plots, and correlation analyses, EDA helps data scientists understand the structure… Continue reading
-
Data Cleaning using Python | Part-7
This post follows up on Data Cleaning using Python | Part-6. The Z-score is a statistical method used to identify outliers in a dataset. It represents the number of standard deviations a data point deviates from the mean. In other words, the Z-score quantifies how far a particular value is from the average, relative to the… Continue reading
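The Z-score definition above can be sketched in a few lines. This is only an illustration, not the code from the post: the `prices` sample and the `threshold` cutoff of 2.5 are assumptions chosen for the example, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the Z-score of each value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("all values are identical; Z-scores are undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical sample with one unusually large value
prices = [10, 12, 11, 13, 12, 11, 10, 12, 13, 95]
threshold = 2.5  # assumed cutoff for this example; 3 is another common choice

scores = z_scores(prices)
# Keep the values whose absolute Z-score exceeds the cutoff
outliers = [x for x, z in zip(prices, scores) if abs(z) > threshold]
print(outliers)
```

In a sample this small a single extreme value cannot reach a Z-score of 3, which is why the example uses a lower cutoff.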
-
Data Cleaning using Python | Part-6
This post follows up on Data Cleaning using Python | Part-5. Handling Outliers: Importance of Handling Outliers in Machine Learning. Handling outliers before creating a machine learning model is crucial because outliers can significantly affect the model's performance and accuracy. Since outliers are data points that deviate considerably from the rest, they can distort statistical measures like… Continue reading
-
Data Cleaning using Python | Part-5
This post follows up on Data Cleaning using Python | Part-4. Feature Scaling: Feature scaling is a crucial transformation when preparing data for machine learning models. It ensures that all attributes operate on a similar scale, improving model performance and convergence speed. The two most common techniques for feature scaling are min-max scaling and standardization. Min-max… Continue reading
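As a rough sketch of the two techniques named above, assuming the usual textbook definitions: min-max scaling maps values to the range 0 to 1, and standardization subtracts the mean and divides by the standard deviation. The function names and the `areas` sample are hypothetical and do not come from the post.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot min-max scale")
    return [(x - lo) / (hi - lo) for x in values]


def standardize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; cannot standardize")
    return [(x - mu) / sigma for x in values]


# Hypothetical feature values on an arbitrary scale
areas = [50, 80, 120, 200]
scaled = min_max_scale(areas)
standardized = standardize(areas)
print(scaled)
print(standardized)
```

Min-max scaling keeps the result inside a fixed range, while standardization has no fixed bounds but is less affected by a single extreme value.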
-
Data Cleaning using Python | Part-3
Skewness and the Log Transformation. This post follows up on Data Cleaning using Python | Part-2. Calculating the Skewness: Now, we will examine whether the SalePrice variable follows a normal distribution, as this assumption is essential for performing regression analysis. While there are several methods to assess normality, we will use a visual approach by… Continue reading
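Alongside a visual check, skewness can also be computed numerically. The sketch below uses the moment-based (Fisher-Pearson) coefficient and then applies a natural-log transformation. The `sale_prices` values are made up for illustration and are not taken from the dataset used in the post.

```python
import math


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (assumes the values are not all equal)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical right-skewed prices (in thousands), with a long upper tail
sale_prices = [100, 120, 130, 150, 180, 250, 600]
log_prices = [math.log(p) for p in sale_prices]

raw_skew = skewness(sale_prices)
log_skew = skewness(log_prices)
print(f"raw skewness: {raw_skew:.2f}")
print(f"log skewness: {log_skew:.2f}")
```

For this sample the log transformation reduces the skewness without removing it entirely; a value near zero would indicate a roughly symmetric distribution.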
-
Data Cleaning using Python | Part-1
In the real world, unlike in tutorials, raw data often contains duplicates, missing values, and irrelevant information. To prepare this data for use in a machine learning project, it’s essential to clean and preprocess it. In this post, I’ll guide you through handling duplicates, addressing missing values, and identifying outliers. Additionally, I’ll demonstrate how to… Continue reading
-
Printing Strings in Python
Printing in Python is easy: call the print() function and put your string within quotation marks: If you would like to see the second string block on a different line, use “\n”: You can concatenate two strings using the plus operator: It would help… Continue reading
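A short illustration of the three points in this excerpt; the strings themselves are placeholders, not the examples from the post.

```python
# Print a string by passing it to print() inside quotation marks
print("Hello, World!")

# "\n" starts a new line inside the same string
two_lines = "First line\nSecond line"
print(two_lines)

# The plus operator concatenates two strings
greeting = "Hello, " + "World!"
print(greeting)
```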
-
Numeric Object Types in Python
Python’s built-in numeric object types include integers, floats, and complex numbers. Numbers in Python support the normal mathematical operations. In this post you will find some code examples for integers and floats. Integers. Floats. There are also modules for working with numeric objects that you can use: Below you can download the code for this Python session:… Continue reading
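A brief sketch of the numeric types mentioned above. The specific values are illustrative, and `math` is used here as one example of a standard-library module for numbers; the post may use different modules.

```python
import math

# Integers: arbitrary precision, with floor division and exponentiation
whole = 7 // 2    # floor division gives an int
power = 2 ** 10

# Floats: true division always returns a float
ratio = 7 / 2
total = 0.1 + 0.2  # binary floating point, so this is close to 0.3 but not equal

# Complex numbers: built from a real and an imaginary part
z = complex(3, 4)
magnitude = abs(z)

# The math module adds functions such as square roots
root = math.sqrt(16)

print(whole, power, ratio, total, magnitude, root)
```

Because floats are approximations, comparisons such as `math.isclose(total, 0.3)` are safer than testing with `==`.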
About Me
My name is Cenk, and I am an economist. On this site I write about economics, econometrics, finance, value investing, programming, calculus, basketball, history, food, books, self-improvement, well-being, and productivity. This site is a personal blog; the posts reflect my personal views and do not represent the views of any organization where I work or have worked.
For my academic work, please visit this site: https://cenkufukyildiran.academia.edu/
Posts on financial markets, trading, investing, and similar topics are not financial advice.
