```
import pandas as pd # Importing pandas
import matplotlib.pyplot as plt # Importing pyplot
import tensorflow as tf # Importing tensorflow
import numpy as np # Importing numpy
# Read in the insurance dataset
insurance = pd.read_csv("https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/insurance.csv")
```

### Pre-process the data with normalization and one-hot encoding

```
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
# Create column transformer (this will help us normalize/preprocess our data)
ct = make_column_transformer(
(MinMaxScaler(), ["age", "bmi", "children"]), # get all values between 0 and 1
(OneHotEncoder(handle_unknown="ignore"), ["sex", "smoker", "region"]) # Performs one-hot encoding on categorical features, converting them into binary vectors.
)
# Create X & y
X = insurance.drop("charges", axis=1) # Features (input data)
y = insurance["charges"] # Target (output data)
# Build our train and test sets (use random state to ensure same split as before)
# This function splits the data into training and testing sets, with 80% of the data used for training and 20% for testing. The random_state parameter ensures reproducibility of the split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit column transformer on the training data only (doing so on test data would result in data leakage)
ct.fit(X_train)
# Transform training and test data with normalization (MinMaxScaler) and one-hot encoding (OneHotEncoder)
X_train_normal = ct.transform(X_train)
X_test_normal = ct.transform(X_test)
```
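To make the "fit on training data only" point concrete, here is a minimal plain-Python sketch of the arithmetic behind min-max scaling. The helper names (`fit_min_max`, `apply_min_max`) and the ages are made up for illustration; this is not scikit-learn's implementation, only the same idea in a few lines.

```python
def fit_min_max(values):
    """Learn the minimum and maximum from the training values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with previously learned statistics: (x - lo) / (hi - lo)."""
    span = hi - lo
    if span == 0:
        # A constant column carries no information, so map it to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical ages, not taken from the insurance dataset
train_ages = [18, 30, 42, 64]
test_ages = [25, 70]

# Statistics come from the training split only (no peeking at test data)
lo, hi = fit_min_max(train_ages)

train_scaled = apply_min_max(train_ages, lo, hi)
test_scaled = apply_min_max(test_ages, lo, hi)

print(train_scaled)  # training values land in [0, 1]
print(test_scaled)   # test values may fall outside [0, 1]
```

Notice that a test value larger than anything seen in training (70 here) scales to slightly more than 1. That is expected: the test set is treated like unseen data, so it has no say in the scaling statistics.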

### Now, create and train our model

```
# Set random seed
tf.random.set_seed(42)
# Build the model (3 layers, 100, 10, 1 units)
insurance_model_3 = tf.keras.Sequential([
tf.keras.layers.Dense(100), # if you get an input-shape error, pass input_shape=(X_train_normal.shape[1],) here (the number of features after preprocessing)
tf.keras.layers.Dense(10),
tf.keras.layers.Dense(1)
])
# Compile the model
insurance_model_3.compile(loss=tf.keras.losses.mae,
optimizer=tf.keras.optimizers.Adam(),
metrics=['mae'])
# Fit the model for 200 epochs (same as insurance_model_2)
history_2 = insurance_model_3.fit(X_train_normal, y_train, epochs=200, verbose=0)
```

```
# Evaluate the 3rd model
insurance_model_3_loss, insurance_model_3_mae = insurance_model_3.evaluate(X_test_normal, y_test)
```

```
# Compare modelling results from non-normalized data and normalized data
insurance_model_2_mae, insurance_model_3_mae
```

From this we can see that normalizing the data results in 10% less error than not normalizing it, using the same model.
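If you want to compute that percentage yourself rather than eyeball it, a small helper like the one below does the job. This is only a sketch: the function name and the two MAE values are placeholders, so swap in your own `insurance_model_2_mae` and `insurance_model_3_mae`.

```python
def relative_error_reduction(baseline_mae, new_mae):
    """Return how much lower new_mae is than baseline_mae, as a percentage."""
    return 100 * (baseline_mae - new_mae) / baseline_mae


# Placeholder values for illustration only (not actual results from the models above)
example_baseline_mae = 5000.0    # e.g. MAE without normalization
example_normalized_mae = 4500.0  # e.g. MAE with normalization

reduction = relative_error_reduction(example_baseline_mae, example_normalized_mae)
print(f"{reduction:.1f}% lower MAE")
```

A negative result would mean the second model did worse than the baseline.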

This is **one of the main benefits of normalization: faster convergence** (a fancy way of saying your model gets to better results faster).

### Optional: see how changing `epochs` makes a difference, and up to what point
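One way to find "up to what point" is to look at the per-epoch loss values and check where they stop improving. The sketch below is plain Python with made-up loss values; the function name, `min_delta` and `patience` are illustrative choices (similar in spirit to the parameters of Keras' `EarlyStopping` callback). With a real model you would pass in something like `history_2.history["loss"]`.

```python
def epoch_where_loss_plateaus(losses, min_delta=1.0, patience=3):
    """Return the 1-based epoch of the best loss once the loss has failed to
    improve by at least min_delta for `patience` epochs in a row.
    Returns len(losses) if the loss never plateaus. Assumes losses is non-empty."""
    best = losses[0]
    best_epoch = 1
    waited = 0
    for epoch, loss in enumerate(losses[1:], start=2):
        if best - loss >= min_delta:
            # Meaningful improvement: remember it and reset the counter
            best = loss
            best_epoch = epoch
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch
    return len(losses)


# Made-up loss curve for illustration (not output from insurance_model_3)
example_losses = [9000, 7000, 5000, 4000, 3600, 3599.5, 3599.4, 3599.3, 3599.2]

plateau_epoch = epoch_where_loss_plateaus(example_losses)
print(f"Loss stopped improving meaningfully after epoch {plateau_epoch}")
```

Training well past that epoch mostly costs time without buying much extra accuracy, which is why it's worth checking the loss curve before bumping `epochs` up further.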

💡 For a deeper understanding, read https://karpathy.github.io/2019/04/25/recipe/