Time Series Forecasting With Prophet in Python

Time series forecasting can be challenging as there are many different methods you could use and many different hyperparameters for each method.

The Prophet library is an open-source library designed for making forecasts for univariate time series datasets. It is easy to use and designed to automatically find a good set of hyperparameters for the model in an effort to make skillful forecasts for data with trends and seasonal structure by default.

In this tutorial, you will discover how to use the Facebook Prophet library for time series forecasting.

After completing this tutorial, you will know:

  • Prophet is an open-source library developed by Facebook and designed for automatic forecasting of univariate time series data.
  • How to fit Prophet models and use them to make in-sample and out-of-sample forecasts.
  • How to evaluate a Prophet model on a hold-out dataset.

Let’s get started.

Time Series Forecasting With Prophet in Python
Photo by Rinaldo Wurglitsch, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

  • Prophet Forecasting Library
  • Car Sales Dataset
      • Load and Summarize Dataset
      • Load and Plot Dataset
  • Forecast Car Sales With Prophet
      • Fit Prophet Model
      • Make an In-Sample Forecast
      • Make an Out-of-Sample Forecast
      • Manually Evaluate Forecast Model

    Prophet Forecasting Library

    Prophet, or “Facebook Prophet,” is an open-source library for univariate (one variable) time series forecasting developed by Facebook.

    Prophet implements what they refer to as an additive time series forecasting model, and the implementation supports trends, seasonality, and holidays.

    Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects

    — Package ‘prophet’, 2019.

    It is designed to be easy and completely automatic, e.g. point it at a time series and get a forecast. As such, it is intended for internal company use, such as forecasting sales, capacity, etc.

    For a great overview of Prophet and its capabilities, see the post:

    The library provides interfaces in both R and Python. We will focus on the Python interface in this tutorial.

    The first step is to install the Prophet library using Pip, as follows:
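    The command below assumes a recent release of the library; releases prior to version 1.0 were published under the name fbprophet, in which case the package name in the command changes accordingly.

    # install the prophet library (use "fbprophet" for releases prior to 1.0)
    pip install prophet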


    Next, we can confirm that the library was installed correctly.

    To do this, we can import the library and print the version number in Python. The complete example is listed below.
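    The snippet below is a sketch assuming the package imports as prophet (releases prior to 1.0 import as fbprophet instead).

    # check prophet version
    import prophet
    # print version number
    print('prophet %s' % prophet.__version__)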


    Running the example prints the installed version of Prophet.

    You should have the same version or higher.


    Now that we have Prophet installed, let’s select a dataset we can use to explore using the library.

    Car Sales Dataset

    We will use the monthly car sales dataset.

    It is a standard univariate time series dataset that contains both a trend and seasonality. The dataset has 108 months of data, and a naive persistence forecast can achieve a mean absolute error of about 3,235 sales, providing a baseline of error that a skillful model must beat.

    No need to download the dataset as we will download it automatically as part of each example.

    Load and Summarize Dataset

    First, let’s load and summarize the dataset.

    Prophet requires data to be in Pandas DataFrames. Therefore, we will load and summarize the data using Pandas.

    We can load the data directly from the URL by calling the read_csv() Pandas function, then summarize the shape (number of rows and columns) of the data and view the first few rows of data.

    The complete example is listed below.
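    The listing below is a sketch of that example; it assumes the dataset is hosted as monthly-car-sales.csv in the author’s Datasets repository on GitHub.

    # load and summarize the car sales dataset
    from pandas import read_csv
    # location of the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    # load the dataset as a DataFrame
    df = read_csv(path, header=0)
    # summarize the shape of the dataset
    print(df.shape)
    # show the first few rows of data
    print(df.head())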


    Running the example first reports the number of rows and columns, then lists the first five rows of data.

    We can see that, as we expected, there are 108 months' worth of data and two columns. The first column is the date and the second is the number of sales.

    Note that the first column in the output is a row index and is not a part of the dataset, just a helpful tool that Pandas uses to order rows.


    Load and Plot Dataset

    A time-series dataset does not make sense to us until we plot it.

    Plotting a time series helps us actually see if there is a trend, a seasonal cycle, outliers, and more. It gives us a feel for the data.

    We can plot the data easily in Pandas by calling the plot() function on the DataFrame.

    The complete example is listed below.
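    The listing below is a sketch of that example, using the same assumed dataset URL as before.

    # load and plot the car sales dataset
    from pandas import read_csv
    from matplotlib import pyplot
    # location of the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    # load the dataset as a DataFrame
    df = read_csv(path, header=0)
    # plot the time series
    df.plot()
    pyplot.show()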


    Running the example creates a plot of the time series.

    We can clearly see the trend in sales over time and a monthly seasonal pattern to the sales. These are patterns we expect the forecast model to take into account.

    Line Plot of Car Sales Dataset

    Now that we are familiar with the dataset, let’s explore how we can use the Prophet library to make forecasts.

    Forecast Car Sales With Prophet

    In this section, we will explore using Prophet to forecast the car sales dataset.

    Let’s start by fitting a model on the dataset.

    Fit Prophet Model

    To use Prophet for forecasting, first, a Prophet() object is defined and configured, then it is fit on the dataset by calling the fit() function and passing the data.

    The Prophet() object takes arguments to configure the type of model you want, such as the type of growth, the type of seasonality, and more. By default, the model will work hard to figure out almost everything automatically.

    The fit() function takes a DataFrame of time series data. The DataFrame must have a specific format. The first column must have the name ‘ds‘ and contain the date-times. The second column must have the name ‘y‘ and contain the observations.

    This means we must change the column names in the dataset. It also requires that the first column be converted to date-time objects, if they are not already (e.g. this can be done as part of loading the dataset with the right arguments to read_csv()).

    For example, we can modify our loaded car sales dataset to have this expected structure, as follows:
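    A minimal sketch of that preparation, assuming df holds the loaded dataset and using the pandas to_datetime() function:

    # prepare the column names expected by Prophet
    df.columns = ['ds', 'y']
    # ensure the date column holds date-time objects
    df['ds'] = to_datetime(df['ds'])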


    The complete example of fitting a Prophet model on the car sales dataset is listed below.
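    The listing below is a sketch of that example; it assumes the dataset URL used earlier and the current package name (releases prior to 1.0 use from fbprophet import Prophet).

    # fit a prophet model on the car sales dataset
    from pandas import read_csv
    from pandas import to_datetime
    from prophet import Prophet
    # load the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    df = read_csv(path, header=0)
    # prepare the column names expected by Prophet
    df.columns = ['ds', 'y']
    df['ds'] = to_datetime(df['ds'])
    # define the model
    model = Prophet()
    # fit the model on the dataset
    model.fit(df)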


    Running the example loads the dataset, prepares the DataFrame in the expected format, and fits a Prophet model.

    By default, the library provides a lot of verbose output during the fit process. I think it’s a bad idea in general as it trains developers to ignore output.

    Nevertheless, the output summarizes what happened during the model fitting process, specifically the optimization processes that ran.


    I will not reproduce this output in subsequent sections when we fit the model.

    Next, let’s make a forecast.

    Make an In-Sample Forecast

    It can be useful to make a forecast on historical data.

    That is, we can make a forecast on data used as input to train the model. Ideally, the model has seen the data before and would make a perfect prediction.

    Nevertheless, this is not the case as the model tries to generalize across all cases in the data.

    This is called making an in-sample (in training set sample) forecast and reviewing the results can give insight into how good the model is. That is, how well it learned the training data.

    A forecast is made by calling the predict() function and passing a DataFrame that contains one column named ‘ds‘ and rows with date-times for all the intervals to be predicted.

    There are many ways to create this “forecast” DataFrame. In this case, we will loop over one year of dates, e.g. the last 12 months in the dataset, and create a string for each month. We will then convert the list of dates into a DataFrame and convert the string values into date-time objects.
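    A sketch of that step, assuming the series ends in December 1968 (as implied by the hold-out evaluation later in the tutorial), so the last 12 months run from 1968-01 to 1968-12 (DataFrame and to_datetime are the pandas class and function):

    # define the period for an in-sample forecast: the last 12 months of the data
    future = list()
    for i in range(1, 13):
        date = '1968-%02d' % i
        future.append([date])
    # convert the list of date strings into a DataFrame of date-time objects
    future = DataFrame(future)
    future.columns = ['ds']
    future['ds'] = to_datetime(future['ds'])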


    This DataFrame can then be provided to the predict() function to calculate a forecast.

    The result of the predict() function is a DataFrame that contains many columns. Perhaps the most important columns are the forecast date time (‘ds‘), the forecasted value (‘yhat‘), and the lower and upper bounds on the predicted value (‘yhat_lower‘ and ‘yhat_upper‘) that provide uncertainty of the forecast.

    For example, we can print the first few predictions as follows:
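    A sketch, assuming forecast holds the DataFrame returned by predict():

    # summarize the first few predictions
    print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head())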


    Prophet also provides a built-in tool for visualizing the prediction in the context of the training dataset.

    This can be achieved by calling the plot() function on the model and passing it a result DataFrame. It will create a plot of the training dataset and overlay the prediction with the upper and lower bounds for the forecast dates.
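    A sketch, assuming model is the fit Prophet model and forecast is the prediction DataFrame:

    # plot the forecast in the context of the training data
    model.plot(forecast)
    pyplot.show()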


    Tying this all together, a complete example of making an in-sample forecast is listed below.
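    The listing below is a sketch of that example, under the same assumptions as before (dataset URL, package name, and a series ending in December 1968).

    # make an in-sample forecast with prophet
    from pandas import read_csv
    from pandas import to_datetime
    from pandas import DataFrame
    from prophet import Prophet
    from matplotlib import pyplot
    # load the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    df = read_csv(path, header=0)
    # prepare the column names expected by Prophet
    df.columns = ['ds', 'y']
    df['ds'] = to_datetime(df['ds'])
    # define and fit the model
    model = Prophet()
    model.fit(df)
    # define the period for an in-sample forecast: the last 12 months of the data
    future = list()
    for i in range(1, 13):
        date = '1968-%02d' % i
        future.append([date])
    future = DataFrame(future)
    future.columns = ['ds']
    future['ds'] = to_datetime(future['ds'])
    # use the model to make an in-sample forecast
    forecast = model.predict(future)
    # summarize the forecast
    print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head())
    # plot the forecast in the context of the training data
    model.plot(forecast)
    pyplot.show()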


    Running the example forecasts the last 12 months of the dataset.

    The first five months of the prediction are reported and we can see that values are not too different from the actual sales values in the dataset.


    Next, a plot is created. We can see the training data are represented as black dots and the forecast is a blue line with upper and lower bounds in a blue shaded area.

    We can see that the forecast for the last 12 months is a good match for the real observations, especially when the bounds are taken into account.

    Plot of Time Series and In-Sample Forecast With Prophet

    Make an Out-of-Sample Forecast

    In practice, we really want a forecast model to make a prediction beyond the training data.

    This is called an out-of-sample forecast.

    We can achieve this in the same way as an in-sample forecast and simply specify a different forecast period.

    In this case, we will use a period beyond the end of the training dataset, starting in January 1969 (1969-01).
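    A sketch of that change, reusing the same pandas helpers as before:

    # define the period for an out-of-sample forecast: the 12 months after the data ends
    future = list()
    for i in range(1, 13):
        date = '1969-%02d' % i
        future.append([date])
    future = DataFrame(future)
    future.columns = ['ds']
    future['ds'] = to_datetime(future['ds'])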


    Tying this together, the complete example is listed below.
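    The listing below is a sketch of that example, under the same assumptions as the in-sample listing.

    # make an out-of-sample forecast with prophet
    from pandas import read_csv
    from pandas import to_datetime
    from pandas import DataFrame
    from prophet import Prophet
    from matplotlib import pyplot
    # load the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    df = read_csv(path, header=0)
    # prepare the column names expected by Prophet
    df.columns = ['ds', 'y']
    df['ds'] = to_datetime(df['ds'])
    # define and fit the model
    model = Prophet()
    model.fit(df)
    # define the period for an out-of-sample forecast: the 12 months after the data ends
    future = list()
    for i in range(1, 13):
        date = '1969-%02d' % i
        future.append([date])
    future = DataFrame(future)
    future.columns = ['ds']
    future['ds'] = to_datetime(future['ds'])
    # use the model to make an out-of-sample forecast
    forecast = model.predict(future)
    # summarize the forecast
    print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head())
    # plot the forecast in the context of the training data
    model.plot(forecast)
    pyplot.show()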


    Running the example makes an out-of-sample forecast for the car sales data.

    The first five rows of the forecast are printed, although it is hard to get an idea of whether they are sensible or not.


    A plot is created to help us evaluate the prediction in the context of the training data.

    The new one-year forecast does look sensible, at least by eye.

    Plot of Time Series and Out-of-Sample Forecast With Prophet

    Manually Evaluate Forecast Model

    It is critical to develop an objective estimate of a forecast model’s performance.

    This can be achieved by holding some data back from the model, such as the last 12 months. We then fit the model on the first portion of the data, use it to make predictions on the held-back portion, and calculate an error measure, such as the mean absolute error across the forecasts. That is, a simulated out-of-sample forecast.

    The score gives an estimate of how well we might expect the model to perform on average when making an out-of-sample forecast.

    We can do this with the sample data by creating a new DataFrame for training with the last 12 months removed.
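    A sketch, assuming df holds the prepared DataFrame with the 'ds' and 'y' columns:

    # create the training dataset with the last 12 months removed
    train = df.drop(df.index[-12:])
    # confirm where the training data ends
    print(train.tail())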


    A forecast can then be made on the last 12 months of date-times.

    We can then retrieve the forecast values and the expected values from the original dataset and calculate a mean absolute error metric using the scikit-learn library.
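    A sketch, assuming forecast holds the predictions for the last 12 months and using the scikit-learn mean_absolute_error() function:

    # calculate the error between the expected and predicted values
    from sklearn.metrics import mean_absolute_error
    y_true = df['y'].values[-12:]
    y_pred = forecast['yhat'].values
    mae = mean_absolute_error(y_true, y_pred)
    print('MAE: %.3f' % mae)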


    It can also be helpful to plot the expected vs. predicted values to see how well the out-of-sample prediction matches the known values.
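    A sketch, continuing from the arrays above:

    # plot expected vs. predicted values for the hold-out period
    pyplot.plot(y_true, label='Actual')
    pyplot.plot(y_pred, label='Predicted')
    pyplot.legend()
    pyplot.show()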


    Tying this together, the example below demonstrates how to evaluate a Prophet model on a hold-out dataset.
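    The listing below is a sketch of that example, under the same assumptions as the earlier listings.

    # evaluate a prophet model on a hold-out dataset
    from pandas import read_csv
    from pandas import to_datetime
    from pandas import DataFrame
    from prophet import Prophet
    from sklearn.metrics import mean_absolute_error
    from matplotlib import pyplot
    # load the dataset (assumed URL)
    path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
    df = read_csv(path, header=0)
    # prepare the column names expected by Prophet
    df.columns = ['ds', 'y']
    df['ds'] = to_datetime(df['ds'])
    # create the training dataset with the last 12 months removed
    train = df.drop(df.index[-12:])
    print(train.tail())
    # define and fit the model on the training data only
    model = Prophet()
    model.fit(train)
    # define the hold-out period: the last 12 months of the data
    future = list()
    for i in range(1, 13):
        date = '1968-%02d' % i
        future.append([date])
    future = DataFrame(future)
    future.columns = ['ds']
    future['ds'] = to_datetime(future['ds'])
    # use the model to make a forecast for the hold-out period
    forecast = model.predict(future)
    # calculate the error between the expected and predicted values
    y_true = df['y'].values[-12:]
    y_pred = forecast['yhat'].values
    mae = mean_absolute_error(y_true, y_pred)
    print('MAE: %.3f' % mae)
    # plot expected vs. predicted values
    pyplot.plot(y_true, label='Actual')
    pyplot.plot(y_pred, label='Predicted')
    pyplot.legend()
    pyplot.show()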


    Running the example first reports the last few rows of the training dataset.

    It confirms the training ends in the last month of 1967 and 1968 will be used as the hold-out dataset.


    Next, a mean absolute error is calculated for the forecast period.

    In this case we can see that the error is approximately 1,336 sales, which is much lower (better) than a naive persistence model that achieves an error of 3,235 sales over the same period.


    Finally, a plot is created comparing the actual vs. predicted values. In this case, we can see that the forecast is a good fit. The model has skill and produces a forecast that looks sensible.

    Plot of Actual vs. Predicted Values for Last 12 Months of Car Sales

    The Prophet library also provides tools to automatically evaluate models and plot results, although those tools don’t appear to work well with data above one day in resolution.

    Further Reading

    This section provides more resources on the topic if you are looking to go deeper.

    Summary

    In this tutorial, you discovered how to use the Facebook Prophet library for time series forecasting.

    Specifically, you learned:

    • Prophet is an open-source library developed by Facebook and designed for automatic forecasting of univariate time series data.
    • How to fit Prophet models and use them to make in-sample and out-of-sample forecasts.
    • How to evaluate a Prophet model on a hold-out dataset.

    Do you have any questions?
    Ask your questions in the comments below and I will do my best to answer.



    Why Do I Get Different Results Each Time in Machine Learning?

    Last Updated on August 27, 2020

    Are you getting different results for your machine learning algorithm?

    Perhaps your results differ from a tutorial and you want to understand why.

    Perhaps your model is making different predictions each time it is trained, even when it is trained on the same data set each time.

    This is to be expected and might even be a feature of the algorithm, not a bug.

    In this tutorial, you will discover why you can expect different results when using machine learning algorithms.

    After completing this tutorial, you will know:

    • Machine learning algorithms will train different models if the training dataset is changed.
    • Stochastic machine learning algorithms use randomness during learning, ensuring a different model is trained each run.
    • Differences in the development environment, such as software versions and CPU type, can cause rounding error differences in predictions and model evaluations.

    Let’s get started.

    Why Do I Get Different Results Each Time in Machine Learning?
    Photo by Bonnie Moreland, some rights reserved.

    Tutorial Overview

    This tutorial is divided into five parts; they are:

  • Help, I’m Getting Different Results!?
  • Differences Caused by Training Data
  • Differences Caused by Learning Algorithm
  • Differences Caused by Evaluation Procedure
  • Differences Caused by Platform

    1. Help, I’m Getting Different Results!?

    Don’t panic. Machine learning algorithms or models can give different results.

    It’s not your fault. In fact, it is often a feature, not a bug.

    We will clearly specify and explain the problem you are having.

    First, let’s get a handle on the basics.

    In applied machine learning, we run a machine learning “algorithm” on a dataset to get a machine learning “model.” The model can then be evaluated on data not used during training or used to make predictions on new data, also not seen during training.

    • Algorithm: Procedure run on data that results in a model (e.g. training or learning).
    • Model: Data structure and coefficients used to make predictions on data.

    For more on the difference between machine learning algorithms and models, see the tutorial:

    Supervised machine learning means we have examples (rows) with input and output variables (columns). We cannot write code to predict outputs given inputs because it is too hard, so we use machine learning algorithms to learn how to predict outputs from inputs given historical examples.

    This is called function approximation, and we are learning or searching for a function that maps inputs to outputs on our specific prediction task in such a way that it has skill, meaning the performance of the mapping is better than random and ideally better than all other algorithms and algorithm configurations we have tried.

    • Supervised Learning: Automatically learn a mapping function from examples of inputs to examples of outputs.

    In this sense, a machine learning model is a program we intend to use for some project or application; it just so happens that the program was learned from examples (using an algorithm) rather than written explicitly with if-statements and such. It’s a type of automatic programming.

    • Machine Learning Model: A “program” automatically learned from historical data.

    Unlike the programming that we may be used to, the programs may not be entirely deterministic.

    The machine learning models may be different each time they are trained. In turn, the models may make different predictions, and when evaluated, may have a different level of error or accuracy.

    There are at least four cases where you will get different results; they are:

    • Different results because of differences in training data.
    • Different results because of stochastic learning algorithms.
    • Different results because of stochastic evaluation procedures.
    • Different results because of differences in platform.

    Let’s take a closer look at each in turn.

    Did I miss a possible cause of a difference in results?
    Let me know in the comments below.

    2. Differences Caused by Training Data

    You will get different results when you run the same algorithm on different data.

    This is referred to as the variance of the machine learning algorithm. You may have heard of it in the context of the bias-variance trade-off.

    The variance is a measure of how sensitive the algorithm is to the specific data used during training.

    • Variance: How sensitive the algorithm is to the specific data used during training.

    A more sensitive algorithm has a larger variance, which will result in more difference in the model, and in turn, the predictions made and evaluation of the model. Conversely, a less sensitive algorithm has a smaller variance and will result in less difference in the resulting model with different training data, and in turn, less difference in the resulting predictions and model evaluation.

    • High Variance: Algorithm is more sensitive to the specific data used during training.
    • Low Variance: Algorithm is less sensitive to the specific data used during training.

    For more on the variance and the bias-variance trade-off, see the tutorial:

    All useful machine learning algorithms will have some variance, and some of the most effective algorithms will have a high variance.

    Algorithms with a high variance often require more training data than those algorithms with less variance. This is intuitive if we consider the model approximating a mapping function from inputs and outputs and the law of large numbers.

    Nevertheless, when you train a machine learning algorithm on different training data, you will get a different model that has different behavior. This means different training data will give models that make different predictions and have a different estimate of performance (e.g. error or accuracy).

    The amount of difference in the results will be related to how different the training data is for each model, and the variance of the specific model and model configuration you have chosen.

    What Should I Do?

    You can often reduce the variance of the model by changing a hyperparameter of the algorithm.

    For example, the k in k-nearest neighbors controls the variance of the algorithm, where small values like k=1 result in high variance and large values like k=21 result in low variance.

    You can reduce the variance by changing the algorithm. For example, simpler algorithms like linear regression and logistic regression have a lower variance than other types of algorithms.

    You can also lower the variance with a high variance algorithm by increasing the size of the training dataset, meaning you may need to collect more data.
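    As a small illustration of this effect (a sketch on a synthetic scikit-learn dataset, not part of the original tutorial), the same k-nearest neighbors algorithm can be fit on several different samples of training data and the spread of test accuracy compared for a small and a large value of k:

    # illustrate variance caused by differences in training data
    from numpy import mean
    from numpy import std
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    # define a synthetic classification dataset
    X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
    for k in [1, 21]:
        scores = list()
        # fit the same algorithm on several different samples of training data
        for seed in range(10):
            X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=seed)
            model = KNeighborsClassifier(n_neighbors=k)
            model.fit(X_train, y_train)
            scores.append(model.score(X_test, y_test))
        # a wider spread of scores suggests a higher-variance configuration
        print('k=%d: mean accuracy=%.3f (std=%.3f)' % (k, mean(scores), std(scores)))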

    3. Differences Caused by Learning Algorithm

    You can get different results when you run the same algorithm on the same data due to the nature of the learning algorithm.

    This is the most likely reason that you’re reading this tutorial.

    You run the same code on the same dataset and get a model that makes different predictions or has a different performance each time, and you think it’s a bug or something. Am I right?

    It’s not a bug, it’s a feature.

    Some machine learning algorithms are deterministic. Just like the programming that you’re used to. That means, when the algorithm is given the same dataset, it learns the same model every time. An example is a linear regression or logistic regression algorithm.

    Some algorithms are not deterministic; instead, they are stochastic. This means that their behavior incorporates elements of randomness.

    Stochastic does not mean random. Stochastic machine learning algorithms are not learning a random model. They are learning a model conditional on the historical data you have provided. Instead, the specific small decisions made by the algorithm during the learning process can vary randomly.

    The impact is that each time the stochastic machine learning algorithm is run on the same data, it learns a slightly different model. In turn, the model may make slightly different predictions, and when evaluated using error or accuracy, may have a slightly different performance.

    For more on stochastic and what it means in machine learning, see the tutorial:

    Adding randomness to some of the decisions made by an algorithm can improve performance on hard problems. Learning a supervised learning mapping function with a limited sample of data from the domain is a very hard problem.

    Finding a good or best mapping function for a dataset is a type of search problem. We test different algorithms and test algorithm configurations that define the shape of the search space and give us a starting point in the search space. We then run the algorithms, which then navigate the search space to a single model.

    Adding randomness can help the search move beyond merely good solutions and find really good or great solutions in the search space. It allows the model to escape local optima, or deceptive local optima where the learning algorithm might otherwise get stuck, and to find better solutions, perhaps even a global optimum.

    For more on thinking about supervised learning as a search problem, see the tutorial:

    An example of an algorithm that uses randomness during learning is a neural network. It uses randomness in two ways:

    • Random initial weights (model coefficients).
    • Random shuffle of samples each epoch.

    Neural networks (deep learning) are a stochastic machine learning algorithm. The random initial weights allow the model to try learning from a different starting point in the search space each algorithm run and allow the learning algorithm to “break symmetry” during learning. The random shuffle of examples during training ensures that each gradient estimate and weight update is slightly different.

    For more on the stochastic nature of neural networks, see the tutorial:

    Another example is ensemble machine learning algorithms that are stochastic, such as bagging.

    Randomness is used in the sampling procedure of the training dataset that ensures a different decision tree is prepared for each contributing member in the ensemble. In ensemble learning, this is called ensemble diversity and is an approach to simulating independent predictions from a single training dataset.

    For more on the stochastic nature of bagging ensembles, see the tutorial:

    What Should I Do?

    The randomness used by learning algorithms can be controlled.

    For example, you can set the seed used by the pseudorandom number generator to ensure that each time the algorithm is run, it gets the same randomness.
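    A minimal sketch with NumPy and scikit-learn (the specific model here is just an illustration):

    # fix the sources of randomness so that repeated runs give identical results
    from numpy.random import seed
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    # fix the global NumPy pseudorandom number generator
    seed(1)
    # define a synthetic classification dataset with a fixed seed
    X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
    # fix the seed used by the stochastic learning algorithm itself
    model = RandomForestClassifier(random_state=1)
    model.fit(X, y)
    # repeated runs of this script now produce the same model and the same predictions
    print(model.predict(X[:5, :]))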

    For more on random number generators and fixing the seed, see the tutorial:

    This can be a good approach for tutorials, but not a good approach in practice. It leads to questions like:

    • What is the best seed for the pseudorandom number generator?

    There is no best seed for a stochastic machine learning algorithm. You are fighting the nature of the algorithm, forcing stochastic learning to be deterministic.

    You could make a case that the final model is fit using a fixed seed to ensure the same model is created from the same data before being used in production prior to any pre-deployment system testing. Nevertheless, as soon as the training dataset changes, the model will change.

    A better approach is to embrace the stochastic nature of machine learning algorithms.

    Consider that there is not a single model for your dataset. Instead, there is a stochastic process (the algorithm pipeline) that can generate models for your problem.

    For more on this, see the tutorial:

    You can then summarize the performance of these models — of the algorithm pipeline — as a distribution with mean expected error or accuracy and a standard deviation.

    You can then ensure you achieve the average performance of the models by fitting multiple final models on your dataset and averaging their predictions when you need to make a prediction on new data.

    For more on the ensemble approach to final models, see the tutorial:

    4. Differences Caused by Evaluation Procedure

    You can get different results when running the same algorithm with the same data due to the evaluation procedure.

    The two most common evaluation procedures are a train-test split and k-fold cross-validation.

    A train-test split involves randomly assigning rows to either be used to train the model or evaluate the model to meet a predefined train or test set size.

    For more on the train-test split, see the tutorial:

    The k-fold cross-validation procedure involves dividing a dataset into k non-overlapping partitions and using one fold as the test set and all other folds as the training set. A model is fit on the training set and evaluated on the holdout fold and this process is repeated k times, giving each fold an opportunity to be used as the holdout fold.

    For more on k-fold cross-validation, see the tutorial:

    Both of these model evaluation procedures are stochastic.

    Again, this does not mean that they are random; it means that small decisions made in the process involve randomness. Specifically, the choice of which rows are assigned to a given subset of the data.

    This use of randomness is a feature, not a bug.

    The use of randomness, in this case, allows the resampling to approximate an estimate of model performance that is independent of the specific data sample drawn from the domain. This approximation is biased because we only have a small sample of data to work with rather than the complete set of possible observations.

    Performance estimates provide an idea of the expected or average capability of the model when making predictions in the domain on data not seen during training. Regardless of the specific rows of data used to train or test the model, at least ideally.

    For more on the more general topic of statistical sampling, see the tutorial:

    As such, each evaluation of a deterministic machine learning algorithm, like a linear regression or a logistic regression, can give a different estimate of error or accuracy.
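    A sketch of this on a synthetic dataset (not from the original tutorial): the model is deterministic, yet its measured accuracy changes with the randomly chosen train-test split.

    # deterministic algorithm, stochastic evaluation: accuracy varies with the split
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    # define a synthetic classification dataset
    X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
    # evaluate the same deterministic model on differently randomized train-test splits
    for split_seed in range(3):
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=split_seed)
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)
        print('split seed=%d accuracy=%.3f' % (split_seed, model.score(X_test, y_test)))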

    What Should I Do?

    The solution in this case is much like the case for stochastic learning algorithms.

    The seed for the pseudorandom number generator can be fixed or the randomness of the procedure can be embraced.

    Unlike stochastic learning algorithms, both solutions are quite reasonable.

    If a large number of machine learning algorithms and algorithm configurations are being evaluated systematically on a predictive modeling task, it can be a good idea to fix the random seed of the evaluation procedure. Any value will do.

    The idea is that each candidate solution (each algorithm or configuration) will be evaluated in an identical manner. This ensures an apples-to-apples comparison. It also allows for the use of paired statistical hypothesis tests later, if needed, to check if differences between algorithms are statistically significant.

    Embracing the randomness can also be appropriate. This involves repeating the evaluation procedure many times and reporting a summary of the distribution of performance scores, such as the mean and standard deviation.

    Perhaps the least biased approach to repeated evaluation would be to use repeated k-fold cross-validation, such as three repeats with 10 folds (3×10), which is common, or five repeats with two folds (5×2), which is commonly used when comparing algorithms with statistical hypothesis tests.
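    A sketch of that kind of repeated evaluation with scikit-learn, reporting the mean and standard deviation of the scores (the model and dataset here are placeholders):

    # summarize performance over repeated k-fold cross-validation
    from numpy import mean
    from numpy import std
    from sklearn.datasets import make_classification
    from sklearn.model_selection import RepeatedStratifiedKFold
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression
    # define a synthetic classification dataset
    X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
    # define the model to evaluate
    model = LogisticRegression(max_iter=1000)
    # three repeats of 10-fold cross-validation (3x10)
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
    # report the distribution of scores rather than a single number
    print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))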

    For a gentle introduction to using statistical hypothesis tests for comparing algorithms, see the tutorial:

    For a tutorial on comparing mean algorithm performance with a hypothesis test, see the tutorial:

    5. Differences Caused by Platform

    You can get different results when running the same algorithm on the same data on different computers.

    This can happen even if you fix the random number seed to address the stochastic nature of the learning algorithm and evaluation procedure.

    The cause in this case is the platform or development environment used to run the example, and the results are often different in minor ways, but not always.

    This includes:

    • Differences in the system architecture, e.g. CPU or GPU.
    • Differences in the operating system, e.g. macOS or Linux.
    • Differences in the underlying math libraries, e.g. LAPACK or BLAS.
    • Differences in the Python version, e.g. 3.6 or 3.7.
    • Differences in the library version, e.g. scikit-learn 0.22 or 0.23.

    Machine learning algorithms are a type of numerical computation.

    This means that they typically involve a lot of math with floating point values. Differences in aspects such as the architecture and operating system can result in differences in rounding errors, which can compound with the number of calculations performed to give very different results.

    Additionally, differences in library versions can mean bug fixes and changes in functionality that can also result in different results.

    This also explains why you will get different results for the same algorithm on the same machine when it is implemented in different languages, such as R and Python. Small differences in the implementation and/or differences in the underlying math libraries used will cause differences in the resulting model and the predictions made by that model.

    What Should I Do?

    This does not mean that the platform itself can be treated as a hyperparameter and tuned for a predictive modeling problem.

    Instead, it means that the platform is an important factor when evaluating machine learning algorithms and should be fixed or fully described to ensure full reproducibility when moving from development to production, or in reporting performance in academic studies.

    One approach might be to use virtualization, such as Docker or a virtual machine instance, to ensure the environment is kept constant if full reproducibility is critical to a project.

    Honestly, the effect is often very small in practice (at least in my limited experience) as long as major software versions are a good or close enough match.

    Further Reading

    This section provides more resources on the topic if you are looking to go deeper.

    Related Tutorials

    Summary

    In this tutorial, you discovered why you can expect different results when using machine learning algorithms.

    Specifically, you learned:

    • Machine learning algorithms will train different models if the training dataset is changed.
    • Stochastic machine learning algorithms use randomness during learning, ensuring a different model is trained each run.
    • Differences in the development environment, such as software versions and CPU type, can cause rounding error differences in predictions and model evaluations.

    Do you have any questions?
    Ask your questions in the comments below and I will do my best to answer.
