Machine Learning Model with TensorFlow.js

09 Apr 2023  Amiya Pattanaik  5 min read

Use Case

Suppose we want to build a machine learning model that predicts the price of a house from attributes of the property and its neighborhood, such as the average number of rooms. Let’s explore how machine learning can be applied to this practical problem.

To solve this real-world problem, we’ll use the TensorFlow.js library, which provides an API for building and training machine learning models in JavaScript.

Steps involved to solve this problem

  • Load and process the training data.
  • Get the number of features in the data.
  • Prepare the Dataset for training.
  • Define the model architecture.
  • Compile the model.
  • Train the model using fitDataset.
  • Evaluate the model on the test data.
  • Log the evaluation result (optional).
  • Make a prediction on a single input.

Code Snippet

Below is the complete code for your reference.

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-node';

const csvUrl = 'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-train.csv';

async function run() {
  // Load and process the training data
  const bostonData = tf.data.csv(
    csvUrl, {
      columnConfigs: {
        medv: {
          isLabel: true
        }
      }
    });

  // Get the number of features in the data
  // Number of features is the number of column names minus one for the label.
  const numOfFeatures = (await bostonData.columnNames()).length - 1;

  // Prepare the Dataset for training.
  const flattenedDataset =
      bostonData
        .map(({xs, ys}) =>
             { return {xs:Object.values(xs), ys:Object.values(ys)} })
        .batch(10);

  // Define the model architecture
  const model = tf.sequential();
  model.add(tf.layers.dense({
    inputShape: [numOfFeatures],
    activation: 'relu',
    units: 10
  }));
  model.add(tf.layers.dense({
    activation: 'linear',
    units: 1
  }));

  // Compile the model
  model.compile({
    loss: 'meanSquaredError',
    optimizer: tf.train.adam(0.1)
  });

  // Train the model using fitDataset
  await model.fitDataset(flattenedDataset, {
    epochs: 50
  });

  const testData = tf.data.csv(
    'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-test.csv', {
      columnConfigs: {
        medv: {
          isLabel: true
        }
      }
    });

  const flattenedTestDataset =
      testData
        .map(({xs, ys}) =>
             { return {xs:Object.values(xs), ys:Object.values(ys)} })
        .batch(10);

  // Evaluate the model on the test data
  const result = await model.evaluateDataset(flattenedTestDataset);

  // Log the evaluation result
  console.log(`Test Loss: ${result}`);

  // Make a prediction on a single input
  const singlePredictionData = tf.tensor2d([
    [0.1, 18.0, 2.31, 0, 0.54, 6.575, 65.2, 4.0900, 1, 296, 15.3, 396.90]
  ]);
  const prediction = model.predict(singlePredictionData);

  console.log(`predictions: ${prediction.dataSync()}`);
}

run();

Output (values may change):

  • Test Loss: Tensor
    • 30.1479434967041
  • predictions: 11.206214904785156
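
The test loss above is a mean squared error, so it is expressed in squared label units. As a rough aid to reading that number, the sketch below computes the same kind of metric in plain JavaScript. The helper name meanSquaredError and the sample arrays are made up for illustration and are not part of TensorFlow.js.

```javascript
// Plain-JavaScript illustration of the loss named in model.compile().
// The helper and the sample numbers are hypothetical, for illustration only.
function meanSquaredError(actual, predicted) {
  if (actual.length !== predicted.length || actual.length === 0) {
    throw new Error('Arrays must be non-empty and of equal length');
  }
  let sumOfSquares = 0;
  for (let i = 0; i < actual.length; i++) {
    const error = actual[i] - predicted[i];
    sumOfSquares += error * error;
  }
  return sumOfSquares / actual.length;
}

// Made-up median values (in $1000s) and made-up predictions.
const actualValues = [24, 22, 35];
const predictedValues = [22, 24, 31];

const mse = meanSquaredError(actualValues, predictedValues);
// The square root brings the error back to the units of the label.
const rmse = Math.sqrt(mse);

console.log(`MSE: ${mse}, RMSE: ${rmse.toFixed(2)}`); // MSE: 8, RMSE: 2.83
```

Applied to the run shown above, Math.sqrt(30.15) is roughly 5.49, which suggests the predictions were off by about $5,500 in a root-mean-square sense.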

Code analysis

In the example code, only the medv column is listed in the columnConfigs object, where it is marked as the label with isLabel: true. Every other column is then treated as a feature column, so we don’t need to configure each column individually as a feature or label. The tf.data.csv() function also infers the column types from the data.
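
To make the feature/label split concrete, here is a minimal plain-JavaScript sketch of the shape of one dataset element and of what the Object.values mapping in the code does to it. The splitRow helper and the shortened three-column row are hypothetical stand-ins. The sketch imitates the behaviour described above and is not how tf.data.csv() is implemented.

```javascript
// Hypothetical helper that mimics the feature/label split described above.
function splitRow(row, labelColumns) {
  const xs = {};
  const ys = {};
  for (const [name, value] of Object.entries(row)) {
    if (labelColumns.includes(name)) {
      ys[name] = value; // columns marked as labels
    } else {
      xs[name] = value; // every other column is treated as a feature
    }
  }
  return {xs, ys};
}

// A shortened, made-up row: two feature columns plus the medv label.
const row = {crim: 0.1, rm: 6.575, medv: 24};
const element = splitRow(row, ['medv']);

// Same flattening step as the map() call in the main listing:
// objects keyed by column name become plain arrays of numbers.
const flattened = {
  xs: Object.values(element.xs),
  ys: Object.values(element.ys)
};

console.log(flattened); // { xs: [ 0.1, 6.575 ], ys: [ 24 ] }
```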

Prediction

Please note that the single row of data [0.1, 18.0, 2.31, 0, 0.54, 6.575, 65.2, 4.0900, 1, 296, 15.3, 396.90] used for prediction in the code sample contains 12 feature values. This row represents a hypothetical house in the Boston area with the following features:

  1. CRIM: Per capita crime rate by town (0.1)
  2. ZN: Proportion of residential land zoned for lots over 25,000 sq.ft. (18.0)
  3. INDUS: Proportion of non-retail business acres per town (2.31)
  4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise) (0)
  5. NOX: Nitric oxides concentration (parts per 10 million) (0.54)
  6. RM: Average number of rooms per dwelling (6.575)
  7. AGE: Proportion of owner-occupied units built prior to 1940 (65.2)
  8. DIS: Weighted distances to five Boston employment centres (4.09)
  9. RAD: Index of accessibility to radial highways (1)
  10. TAX: Full-value property-tax rate per $10,000 (296)
  11. PTRATIO: Pupil-teacher ratio by town (15.3)
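  12. Twelfth feature (396.9): this value matches the B statistic of the classic Boston housing data, 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town. MEDV, the median value of owner-occupied homes in $1000s, is the label being predicted and is not part of the input. The twelfth column of the CSV used here may differ, so check its header row.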

These features are used as input to the neural network model to make a prediction of the median value of owner-occupied homes in $1000s for the given property.
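
As a rough picture of what the network does with such a row, the sketch below runs a forward pass through one ReLU dense layer and one linear output unit in plain JavaScript. It is scaled down to 3 inputs and 2 hidden units, and every weight and bias is a made-up value. The model in the listing uses 12 inputs, 10 hidden units, and weights learned during training.

```javascript
// Dense layer: output[j] = activation(sum_i(input[i] * weights[i][j]) + bias[j]).
function dense(input, weights, bias, activation) {
  return bias.map((b, j) => {
    let sum = b;
    for (let i = 0; i < input.length; i++) {
      sum += input[i] * weights[i][j];
    }
    return activation(sum);
  });
}

const relu = (x) => Math.max(0, x);
const linear = (x) => x;

// Made-up parameters for a tiny 3 -> 2 -> 1 network (illustration only).
const hiddenWeights = [[1, -1], [2, 0], [0, -1]]; // shape [3, 2]
const hiddenBias = [0, 1];
const outputWeights = [[3], [2]];                 // shape [2, 1]
const outputBias = [1];

const features = [1, 2, 3];
const hidden = dense(features, hiddenWeights, hiddenBias, relu);
const output = dense(hidden, outputWeights, outputBias, linear);

console.log(hidden, output); // [ 5, 0 ] [ 16 ]
```

The second hidden unit receives a negative sum (-3), which ReLU clips to 0, so only the first hidden unit contributes to the final value.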

Note: To generate predictions for every row of the test dataset rather than for a single input, you can reuse the flattened test dataset and pass its feature values (xs) to the model. Here’s an example code snippet:

const testData = tf.data.csv(
  'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-test.csv', {
    columnConfigs: {
      medv: {
        isLabel: true
      }
    }
  });

const flattenedTestDataset =
  testData
    .map(({xs, ys}) =>
      { return {xs:Object.values(xs), ys:Object.values(ys)} })
    .batch(10);

// model.predict() expects tensors, not a tf.data.Dataset, so predict one batch at a time.
// Run this inside an async function such as run().
await flattenedTestDataset.forEachAsync(({xs}) => {
  const predictions = model.predict(xs);
  predictions.print();
});

To summarize: in this code, we load the test data from a CSV file using the tf.data.csv() method and flatten it using the map() and batch() methods, just like we did with the training data. We then take the xs data (i.e., the input features) from each batch of the flattened dataset and pass it to the model.predict() function. Each call returns a Tensor containing the predictions for the input examples in that batch.
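
The batch(10) step used above can be pictured with a few lines of plain JavaScript. The toBatches helper below is a hypothetical stand-in that only groups array elements. The real batch() method of a TensorFlow.js dataset additionally stacks each group into tensors.

```javascript
// Hypothetical helper: group rows into consecutive batches of batchSize.
function toBatches(rows, batchSize) {
  const batches = [];
  for (let start = 0; start < rows.length; start += batchSize) {
    batches.push(rows.slice(start, start + batchSize));
  }
  return batches;
}

// 25 made-up rows with two values each.
const rows = Array.from({length: 25}, (_, i) => [i, i * 2]);
const batches = toBatches(rows, 10);

// The last batch holds whatever rows are left over.
console.log(batches.map((batch) => batch.length)); // [ 10, 10, 5 ]
```

Because the data arrives in groups like these, predictions on a dataset are produced one batch at a time rather than in a single call.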

Conclusion

All done! We created a machine learning model in JavaScript with TensorFlow.js that predicts the median price of a house from its features. I hope this clears up any confusion you may have about how to write working code for building AI models. Please visit my other AI-related writings on this website. Enjoy your reading!

Amiya Pattanaik

Amiya is a Product Engineering Director focused on Product Development, Quality Engineering & User Experience. He writes about his experiences here.