How to Create an AI Model with TensorFlow (Summary)

import pandas as pd # Importing pandas
import matplotlib.pyplot as plt # Importing pyplot
import tensorflow as tf # Importing tensorflow
import numpy as np # Importing numpy

# Read in the insurance dataset
insurance = pd.read_csv("https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/insurance.csv")

The CSV data contains the feature columns age, sex, bmi, children, smoker and region, plus the charges column we want to predict.
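To preview the first few rows yourself, you can call head() on the DataFrame (a small sketch; the exact values depend on the source file):

# Inspect the first five rows of the insurance dataset
insurance.head()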

Pre-process the data with normalization and one-hot encoding

from sklearn.compose import make_column_transformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.model_selection import train_test_split

# Create column transformer (this will help us normalize/preprocess our data)
ct = make_column_transformer(
    (MinMaxScaler(), ["age", "bmi", "children"]), # get all values between 0 and 1
    (OneHotEncoder(handle_unknown="ignore"), ["sex", "smoker", "region"]) # Performs one-hot encoding on categorical features, converting them into binary vectors.
)

# Create X & y
X = insurance.drop("charges", axis=1) # Features (input data)
y = insurance["charges"] # Target (output data)

# Build our train and test sets (use random state to ensure same split as before)
# This function splits the data into training and testing sets, with 80% of the data used for training and 20% for testing. The random_state parameter ensures reproducibility of the split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit column transformer on the training data only (doing so on test data would result in data leakage)
ct.fit(X_train)

# Transform training and test data with normalization (MinMaxScaler) and one-hot encoding (OneHotEncoder)
X_train_normal = ct.transform(X_train)
X_test_normal = ct.transform(X_test)
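As a quick sanity check (a sketch; the exact row counts depend on your split), you can compare the shapes before and after the transform, since one-hot encoding expands each categorical column into several binary columns:

# 6 feature columns before the transform, 11 after
# (3 scaled numeric columns + 2 for sex + 2 for smoker + 4 for region)
print(X_train.shape)         # e.g. (1070, 6)
print(X_train_normal.shape)  # e.g. (1070, 11)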

Now, create and train the model.

# Set random seed
tf.random.set_seed(42)

# Build the model (3 layers, 100, 10, 1 units)
insurance_model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(100), # if you get an input-shape error, add tf.keras.layers.Input(shape=(X_train_normal.shape[1],)) as the first layer
  tf.keras.layers.Dense(10),
  tf.keras.layers.Dense(1)
])

# Compile the model
insurance_model_3.compile(loss=tf.keras.losses.mae,
                          optimizer=tf.keras.optimizers.Adam(),
                          metrics=['mae'])

# Fit the model for 200 epochs (same as insurance_model_2)
history_2 = insurance_model_3.fit(X_train_normal, y_train, epochs=200, verbose=0)
# Evaluate the 3rd model
insurance_model_3_loss, insurance_model_3_mae = insurance_model_3.evaluate(X_test_normal, y_test)
# Compare modelling results from the non-normalized data (insurance_model_2, trained earlier) and the normalized data
insurance_model_2_mae, insurance_model_3_mae

From this we can see that normalizing the data results in about 10% less error with the same model than training on the non-normalized data.

This is one of the main benefits of normalization: faster convergence (a fancy way of saying your model gets to better results faster).
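One way to see this is to plot the training loss captured in history_2 (a minimal sketch, using the pandas and matplotlib imports from the start):

# Plot the loss recorded during fit() to see how quickly the model converges
pd.DataFrame(history_2.history).plot()
plt.ylabel("loss")
plt.xlabel("epochs")
plt.show()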

Optional: you can also experiment with the number of epochs to see how much difference it makes, and up to what point the model keeps improving.
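For example, a hedged sketch (EarlyStopping is a standard Keras callback, not part of the original notebook) that keeps training until the loss stops improving instead of hard-coding 200 epochs:

# Stop training automatically once the loss has not improved for 10 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=10,
                                              restore_best_weights=True)

history_3 = insurance_model_3.fit(X_train_normal, y_train,
                                  epochs=500, verbose=0,
                                  callbacks=[early_stop])
print(len(history_3.history["loss"])) # number of epochs actually run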
