5 Important Diagram Types
Here, I’ll show you how to analyze a dataset created at runtime and extract meaningful insights from it with 5 diagram types for data analysis:
- Bar Chart
- Scatter Plot
- Histogram
- Box Plot
- Line Chart
For this project, we’ll create a dataset, clean it, filter the data, and build a meaningful visualization with each of those 5 types.
http://www.softwareschule.ch/examples/pydemo91.htm
First, let’s import the necessary libraries and create our employee dataset:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Create Employee Dataset
data = {
'Employee_ID': range(1001, 1011),
'Name': ['Alice','Bob','Charlie','David','Emma','Max', 'Grace','Helen','Isaac','Julia'],
'Age': [25, 28, 35, 40, 22, 30, 45, 50, 29, 38],
'Department': ['HR','IT','IT','Finance','HR','Finance','IT', 'Marketing','HR','Finance'],
'Salary': [50000, 70000, 85000, 92000, 48000, 78000, 110000, 65000, 52000, 88000],
'Experience_Years': [2, 4, 10, 15, 1, 8, 20, 12, 3, 11],
'Performance_Score': [3.2, 4.5, 4.8, 3.7, 2.9, 4.2, 4.9, 3.8, 3.5, 4.1]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Display first few rows
print(df.head())
The same script can be transpiled to maXbox, Rust or PHP. In maXbox it runs through Python for Delphi, with each Python statement passed to execstr():
//# Import libraries
execstr('import pandas as pd');
execstr('import numpy as np');
//# Create Employee Dataset
execstr('data = { '+LF+
'"Employee_ID": range(1001, 1011), '+LF+
'"Name":["Alice","Bob","Charlie","David","Emma","Max","Grace","Helen","Isaac","Julia"],'+LF+
'"Age":[25, 28, 35, 40, 22, 30, 45, 50, 29, 38], '+LF+
'"Department":["HR","IT","IT","Finance","HR","Finance","IT","Marketing","HR","Finance"],'+LF+
'"Salary":[50000,70000,85000,92000,48000,78000,110000,65000,52000,88000],'+LF+
'"Experience_Years":[2, 4, 10, 15, 1, 8, 20, 12, 3, 11],'+LF+
'"Performance_Score":[3.2,4.5,4.8,3.7,2.9,4.2,4.9,3.8,3.5,4.1]'+LF+
'} ');
//# Convert to DataFrame
execstr('df = pd.DataFrame(data)');
//# Display first few rows
execstr('print(df.head())');
//Data Cleaning
//# Check for missing values
execstr('print(df.isnull().sum())');
//# Check data types
execstr('print(df.dtypes)');
//# Convert categorical columns to category type
execstr('df[''Department''] = df[''Department''].astype(''category'')');
//# Add an Experience Level column
execstr('df[''Experience_Level''] = pd.cut(df[''Experience_Years''],'+LF+
'bins=[0,5,10,20], labels=[''Junior'',''Mid'',''Senior''])');
//# Show the updated DataFrame
execstr('print(df.head())');
//Find Employees with High Salaries
execstr('high_salary_df = df[df[''Salary''] > 80000]');
execstr('print(high_salary_df)');
//Find Average Salary by Department
execstr('print(df.groupby(''Department'')[''Salary''].mean())');
//Find the Highest Performing Department
execstr('print(f"Highest Performing Department: {df.groupby(''Department'')[''Performance_Score''].mean().idxmax()}")');
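For readers who want to check the cleaning and analysis steps outside maXbox, here is a minimal plain-Python sketch of the same calls. It assumes only that pandas is installed. The observed=True argument is an addition that is not in the original script; with this data it does not change the result and only avoids a warning in recent pandas versions. Note that pd.cut uses right-closed bins by default, so 5 years counts as Junior and 10 years as Mid.

```python
import pandas as pd

# Same employee data as in the article
data = {
    "Employee_ID": range(1001, 1011),
    "Name": ["Alice", "Bob", "Charlie", "David", "Emma",
             "Max", "Grace", "Helen", "Isaac", "Julia"],
    "Age": [25, 28, 35, 40, 22, 30, 45, 50, 29, 38],
    "Department": ["HR", "IT", "IT", "Finance", "HR",
                   "Finance", "IT", "Marketing", "HR", "Finance"],
    "Salary": [50000, 70000, 85000, 92000, 48000,
               78000, 110000, 65000, 52000, 88000],
    "Experience_Years": [2, 4, 10, 15, 1, 8, 20, 12, 3, 11],
    "Performance_Score": [3.2, 4.5, 4.8, 3.7, 2.9, 4.2, 4.9, 3.8, 3.5, 4.1],
}
df = pd.DataFrame(data)

# Data cleaning: count missing values per column and inspect the types
missing = df.isnull().sum()
print(missing)
print(df.dtypes)

# Store the department as a categorical column
df["Department"] = df["Department"].astype("category")

# Bins are right-closed: (0, 5] Junior, (5, 10] Mid, (10, 20] Senior
df["Experience_Level"] = pd.cut(
    df["Experience_Years"],
    bins=[0, 5, 10, 20],
    labels=["Junior", "Mid", "Senior"],
)

# Filter: employees earning more than 80000
high_salary_df = df[df["Salary"] > 80000]
print(high_salary_df)

# Aggregate: mean salary per department
# (observed=True is an assumption added for this sketch)
avg_salary = df.groupby("Department", observed=True)["Salary"].mean()
print(avg_salary)

# Department with the highest mean performance score
best_department = (
    df.groupby("Department", observed=True)["Performance_Score"].mean().idxmax()
)
print(f"Highest Performing Department: {best_department}")
```

Keeping the intermediate results in variables such as avg_salary makes them easy to reuse for the plots that follow.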
Now we create the 5 visualizations using the Matplotlib and Seaborn modules:
//Step 4: Data Visualization
//📊 1. Bar Chart — Average Salary by Department
execstr('import matplotlib.pyplot as plt');
execstr('import seaborn as sns');
execstr('plt.figure(figsize=(8,5))'+LF+
'sns.barplot(x=df[''Department''],y=df[''Salary''],estimator=np.mean,palette="coolwarm")'+LF+
'plt.title(''Average Salary by Department'', fontsize=14) '+LF+
'plt.xlabel(''Department'', fontsize=12) '+LF+
'plt.ylabel(''Average Salary'', fontsize=12) '+LF+
'plt.xticks(rotation=45) '+LF+
'plt.show() ');
//📈 2. Scatter Plot — Salary vs Experience
execstr('plt.figure(figsize=(9,5))'+LF+
'sns.scatterplot(x=df["Experience_Years"],y=df["Salary"],hue=df["Department"],palette="Dark2",s=100)'+LF+
'plt.title(''Salary vs Experience'', fontsize=14) '+LF+
'plt.xlabel(''Years of Experience'', fontsize=12) '+LF+
'plt.ylabel(''Salary'', fontsize=12) '+LF+
'plt.legend(title="Department",bbox_to_anchor=(1, 1),fontsize=8) '+LF+
'plt.show() ');
//📊 3. Histogram — Salary Distribution
execstr('plt.figure(figsize=(8,5)) '+LF+
'plt.hist(df["Salary"], bins=5, color="blue", alpha=0.7, edgecolor="black") '+LF+
'plt.title("Salary Distribution", fontsize=14) '+LF+
'plt.xlabel("Salary", fontsize=12) '+LF+
'plt.ylabel("Frequency", fontsize=12) '+LF+
'plt.show() ');
//📊 4. Box Plot — Salary by Department
execstr('plt.figure(figsize=(8,5)) '+LF+
'sns.boxplot(x=df["Department"], y=df["Salary"], palette="pastel") '+LF+
'plt.title("Salary Distribution by Department", fontsize=14) '+LF+
'plt.xlabel("Department", fontsize=12) '+LF+
'plt.ylabel("Salary", fontsize=12) '+LF+
'plt.xticks(rotation=45) '+LF+
'plt.show() ');
Matplotlib and Seaborn are data visualization libraries for Python. pyplot, a submodule of Matplotlib, is a collection of functions for creating a variety of charts. A line chart shows the relationship between two variables, X and Y, each plotted on its own axis. The fifth diagram is such a simple line plot:
//📈 5. Linechart — Salary vs Experience
execstr('plt.figure(figsize=(9,5))'+LF+
'sns.lineplot(x=df["Experience_Years"],y=df["Salary"],hue=df["Department"],palette="Dark2")'+LF+
'plt.title(''Salary vs Experience mX5'', fontsize=14) '+LF+
'plt.xlabel(''Years of Experience'', fontsize=12) '+LF+
'plt.ylabel(''Salary'', fontsize=12) '+LF+
'#plt.legend(title="Department",bbox_to_anchor=(1, 1),fontsize=8) '+LF+
'plt.show() ');
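The histogram and the box plot summarize numbers that can also be computed directly, which is a handy cross-check on what the figures show. The following is a sketch only, assuming NumPy and pandas are available. It uses the whole Salary column rather than one box per department, and it assumes the default whisker rule of 1.5 times the interquartile range; the variable names are chosen for this illustration.

```python
import numpy as np
import pandas as pd

# Salary column from the employee dataset
salary = pd.Series(
    [50000, 70000, 85000, 92000, 48000, 78000, 110000, 65000, 52000, 88000],
    name="Salary",
)

# Histogram: bins=5 splits the range [min, max] into five equal-width bins
counts, edges = np.histogram(salary, bins=5)
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>8.0f} to {right:>8.0f}: {n} employee(s)")

# Box plot: the box spans the first to third quartile, the line is the median
q1, median, q3 = salary.quantile([0.25, 0.5, 0.75]).tolist()
iqr = q3 - q1

# Points outside these limits would be drawn as individual outliers
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = salary[(salary < lower_limit) | (salary > upper_limit)]

print(f"Q1={q1:.0f}, median={median:.0f}, Q3={q3:.0f}, IQR={iqr:.0f}")
print(f"Outliers: {outliers.tolist()}")
```

The same quartile calculation can be applied per department with groupby, but with only one to three employees per group the resulting boxes carry little statistical weight.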
Each plt.show() call then displays the corresponding figure:





To go further, try working with larger datasets, dive into more advanced Pandas functions, or explore machine learning with Scikit-learn, building on the statistical methods covered in 7 Data Science Statistical Methods (Code Blog).
Several types of visualizations are commonly used in data science with Python, including:
- Bar charts: Used to compare different categories.
- Line charts: Used to analyze trends over time or across many categories.
- Pie charts: Used to show proportions or percentages of different categories.
- Histograms: Used to show the distribution of a single variable.
- Heatmaps: Used to show the correlation between different variables.
- Scatter plots: Used to show the relationship between two continuous variables.
- Box plots: Used to show the distribution of a variable and to identify outliers.
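Two types from this list, the pie chart and the heatmap, are not among the five diagrams built earlier. As a rough sketch of how they could look on the same employee data, the following uses Matplotlib only. That is a deliberate simplification: the heatmap is drawn with imshow, whereas Seaborn's heatmap function would be the more compact alternative. The Agg backend, the figure sizes and the titles are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 28, 35, 40, 22, 30, 45, 50, 29, 38],
    "Department": ["HR", "IT", "IT", "Finance", "HR",
                   "Finance", "IT", "Marketing", "HR", "Finance"],
    "Salary": [50000, 70000, 85000, 92000, 48000,
               78000, 110000, 65000, 52000, 88000],
    "Experience_Years": [2, 4, 10, 15, 1, 8, 20, 12, 3, 11],
    "Performance_Score": [3.2, 4.5, 4.8, 3.7, 2.9, 4.2, 4.9, 3.8, 3.5, 4.1],
})

# Pie chart: share of employees per department
dept_counts = df["Department"].value_counts()
fig_pie, ax_pie = plt.subplots(figsize=(6, 6))
wedges, texts, autotexts = ax_pie.pie(
    dept_counts, labels=dept_counts.index, autopct="%1.0f%%"
)
ax_pie.set_title("Employees by Department")

# Heatmap: pairwise Pearson correlation of the numeric columns
numeric_cols = ["Age", "Salary", "Experience_Years", "Performance_Score"]
corr = df[numeric_cols].corr()
fig_heat, ax_heat = plt.subplots(figsize=(6, 5))
image = ax_heat.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_heat.set_xticks(range(len(numeric_cols)))
ax_heat.set_xticklabels(numeric_cols, rotation=45, ha="right")
ax_heat.set_yticks(range(len(numeric_cols)))
ax_heat.set_yticklabels(numeric_cols)
fig_heat.colorbar(image, ax=ax_heat, label="Pearson correlation")
ax_heat.set_title("Correlation Heatmap")
fig_heat.tight_layout()
```

With an interactive backend, plt.show() displays both figures; alternatively, fig_pie.savefig(...) and fig_heat.savefig(...) write them to image files.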