python

February 13, 2025

Exploratory Data Analysis for Machine Learning | Part-1

Exploratory Data Analysis (EDA) is a critical first step in any machine learning project. It involves examining and visualizing datasets to uncover patterns, detect anomalies, and gain insights that inform data preprocessing and model selection. By using statistical summaries, visualizations like histograms and scatter plots, and correlation analyses, EDA helps data scientists understand the structure… Continue reading
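As a rough illustration of the statistical summaries and correlation analysis mentioned in this excerpt, here is a minimal sketch that uses only the standard library. The measurements are made-up, penguin-like numbers, and `summarize` and `pearson` are hypothetical helper names, not code taken from the post.

```python
from math import sqrt
from statistics import mean, median, pstdev

# Made-up measurements for illustration only (not real penguin data).
flipper_mm = [181, 186, 195, 193, 190, 210, 215]
mass_g = [3750, 3800, 3250, 3450, 3650, 4500, 5000]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "std": pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


print(summarize(flipper_mm))
print(round(pearson(flipper_mm, mass_g), 3))
```

A coefficient close to 1 or -1 suggests a strong linear relationship worth plotting. A value near 0 only rules out a linear one.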

Datascience, Machine Learning

datascience, eda, exploratory data analysis, machine learning, penguins, python
February 11, 2025

Data Cleaning using Python | Part-7

This post follows up on Data Cleaning using Python | Part-6. The Z-score is a statistical measure used to identify outliers in a dataset. It is the number of standard deviations by which a data point deviates from the mean. In other words, the Z-score quantifies how far a particular value is from the average, relative to the… Continue reading
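The Z-score definition above can be sketched in a few lines. This is a minimal example under stated assumptions: the data are made up, the helper names `z_scores` and `find_outliers` are hypothetical, the population standard deviation is used, and the cutoff of 3 is a common convention rather than something the excerpt specifies.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def find_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Made-up data with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]
print(find_outliers(data))
```

With very small samples the largest possible Z-score is bounded, so a fixed cutoff of 3 may never trigger. A lower threshold or a different method may suit such data better.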

Datascience, Machine Learning

datascience, outliers, python, z-score
February 5, 2025

Data Cleaning using Python | Part-6

This post follows up on Data Cleaning using Python | Part-5. Handling the Outliers: Importance of Handling Outliers in Machine Learning. Handling outliers before creating a machine learning model is crucial because outliers can significantly affect the model's performance and accuracy. Since outliers are data points that deviate considerably from the rest, they can distort statistical measures like… Continue reading
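To make the distortion concrete, here is a small sketch with made-up numbers. It compares the mean and the median of a sample before and after a single extreme value is added. The variable names are illustrative only.

```python
from statistics import mean, median

# Made-up values; the appended 500 is a deliberate outlier.
typical = [48, 50, 51, 49, 52, 50]
with_outlier = typical + [500]

# How far each statistic moves once the outlier is included.
mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)

print("mean shift:", round(mean_shift, 2))
print("median shift:", median_shift)
```

In this sample the mean moves substantially while the median does not. This is one reason a model that relies on means or variances can be sensitive to unhandled outliers.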

Datascience, Machine Learning

box-plot, data cleaning, datascience, machine learning, outliers, python, scatter
February 4, 2025

Data Cleaning using Python | Part-5

This post follows up on Data Cleaning using Python | Part-4. Feature Scaling: Feature scaling is a crucial transformation when preparing data for machine learning models. It puts all attributes on a similar scale, which often improves model performance and convergence speed. The two most common techniques for feature scaling are min-max scaling and standardization. Min-max… Continue reading
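The two techniques named above can be sketched with the standard library alone. This is a minimal illustration rather than the post's own code: `min_max_scale` and `standardize` are hypothetical helper names, the ages are made up, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up ages for illustration.
ages = [18, 25, 40, 60]
scaled = min_max_scale(ages)
standardized = standardize(ages)

print(scaled)
print([round(z, 3) for z in standardized])
```

Min-max scaling bounds the result to the range 0 to 1 but is sensitive to extreme values. Standardization has no fixed bounds and yields a mean of 0 and a standard deviation of 1.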

Datascience, Machine Learning, Python

data cleaning, data cleaning for machine learning, machine learning, python
January 28, 2025

Data Cleaning using Python | Part-3

Skewness and the Log Transformation. This post follows up on Data Cleaning using Python | Part-2. Calculating the Skewness: Now, we will examine whether the SalePrice variable follows a normal distribution, as approximate normality is a common assumption in regression analysis. While there are several methods to assess normality, we will use a visual approach by… Continue reading
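The post itself takes a visual approach. As a numeric complement, the sketch below computes skewness before and after a log transformation. The prices are hypothetical values chosen to double at each step, not the SalePrice data from the post, and `skewness` is an illustrative helper that uses the population formula.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the average of the cubed Z-scores."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


# Hypothetical sale prices that double at each step (right-skewed).
sale_prices = [25_000, 50_000, 100_000, 200_000, 400_000]

# The natural log turns the doubling into evenly spaced values.
log_prices = [math.log(p) for p in sale_prices]

print("raw skewness:", round(skewness(sale_prices), 3))
print("log skewness:", round(skewness(log_prices), 3))
```

A positive value indicates a long right tail. A value near zero after the transformation suggests the logged variable is roughly symmetric, though real data will rarely be this tidy.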

Datascience, Machine Learning, Python

data cleaning, datascience, machine learning, python
January 2, 2025

Data Cleaning using Python | Part-1

In the real world, unlike in tutorials, raw data often contains duplicates, missing values, and irrelevant information. To prepare this data for use in a machine learning project, it’s essential to clean and preprocess it. In this post, I’ll guide you through handling duplicates, addressing missing values, and identifying outliers. Additionally, I’ll demonstrate how to… Continue reading
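As a rough sketch of two of the steps mentioned above, removing duplicates and filling missing values, here is a standard-library example. The rows, the column names, and the helpers `drop_duplicates` and `fill_missing` are all hypothetical. Filling with the median is one common choice, not necessarily the one used in the post.

```python
from statistics import median

# Made-up records: one exact duplicate and one missing price.
rows = [
    {"id": 1, "price": 200.0},
    {"id": 2, "price": None},
    {"id": 1, "price": 200.0},  # duplicate of the first row
    {"id": 3, "price": 340.0},
]


def drop_duplicates(records):
    """Keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def fill_missing(records, column):
    """Replace None in the given column with the median of observed values."""
    observed = [r[column] for r in records if r[column] is not None]
    fill_value = median(observed)
    return [
        {**r, column: fill_value if r[column] is None else r[column]}
        for r in records
    ]


cleaned = fill_missing(drop_duplicates(rows), "price")
print(cleaned)
```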

Datascience, Machine Learning, Python

datascience, describing data, importing data, python
May 29, 2023

Printing Strings in Python

Printing in Python is easy: use the print() function and put your string inside quotation marks. To show the second part of a string on a new line, use “\n”. You can concatenate two strings using the plus operator. It would help… Continue reading
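The three ideas in this excerpt fit in a short example. The strings and variable names are made up for illustration.

```python
# Print a string literal; the text goes inside quotation marks.
print("Hello, World!")

# "\n" starts a new line inside the same string.
print("First line\nSecond line")

# The plus operator concatenates strings.
greeting = "Hello"
name = "World"
message = greeting + ", " + name + "!"
print(message)
```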

Python

coding, python, python-programming, strings
April 16, 2023

Numeric Object Types in Python

Python’s built-in numeric object types include integers, floats and complex numbers. Numbers in Python support the normal mathematical operations. In this post you will find code examples for integers and floats. There are also modules dedicated to numeric objects that you can use. Below you can download the code for this Python session:… Continue reading
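A few lines are enough to show the usual operations on integers and floats. This is an illustrative sketch, not the downloadable code from the post. The `math` module stands in as one example of a standard-library module for numeric work.

```python
import math

a = 7
b = 2

print(a + b, a - b, a * b)  # addition, subtraction, multiplication
print(a / b)   # true division always returns a float
print(a // b)  # floor division keeps the integer part
print(a % b)   # remainder
print(a ** b)  # exponentiation

# Floats are binary approximations, so compare them with a tolerance.
x = 0.1 + 0.2
print(x == 0.3)              # exact comparison can be surprising
print(math.isclose(x, 0.3))  # tolerance-based comparison

print(math.sqrt(16))  # square root from the math module
```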

Python

programming, python

About Me

My name is Cenk, and I am an economist. On this site I write about economics, econometrics, finance, value investing, programming, calculus, basketball, history, food, books, self-improvement, well-being and productivity. This is a personal blog; the posts reflect my personal views and do not represent those of the organizations where I have worked.
For my academic work, please visit this site: https://cenkufukyildiran.academia.edu/
Posts about financial markets, trading, investing and similar topics are not financial advice.

Cenk Yildiran
