[Cover image. Source: skrillt108]

A Practical Guide to Pricing Optimisation using Machine Learning

Pricing optimisation stands as a pivotal element in business strategy, wielding a direct impact on both profitability and customer behaviour.

Varun Tyagi · Published in Operations Research Bit · 9 min read · Jan 18, 2024

Introduction

Historically, pricing decisions were frequently grounded in market trends, competition analysis, and intuition. Nevertheless, the emergence of machine learning has empowered businesses to harness data-driven insights, fostering more discerning pricing decisions. Examining the trends of GenAI in 2023, companies are increasingly gravitating towards advanced GenAI models to refine their pricing strategies. For illustrative purposes, we will walk through a straightforward example demonstrating how machine learning techniques can be deployed to optimise pricing using Python.

The Scenario

Consider a scenario where you have customer data, product data, and sales data, and you want to optimise the pricing of your products based on customer characteristics and historical sales data. We'll go through each step of the process, from data generation to model training and, finally, pricing optimisation.

Data Generation

The Python code below generates synthetic data covering customer, product, and sales information. This includes attributes such as customer age, gender, and location, along with product-related details like category, brand, price, and sales quantity. The primary objective is to forecast sales quantity based on customer age and product price. It is essential to acknowledge that the output is fictitious, given that the data is randomly generated. Using pandas DataFrames to read the data from CSV files and applying the models to them is a practical alternative (a sketch follows the code below); the logic and approach remain the same.

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression

# Generate synthetic data.
# np.random is the NumPy submodule that provides functions for generating
# pseudo-random numbers. The number 42 is an arbitrary choice of seed: it
# means that every time you run the program, you get the same sequence of
# random numbers, which is useful for debugging and for reproducible results.
np.random.seed(42)

# Generate random customer data
customer_data = pd.DataFrame({
    'customer_id': np.arange(1, 501),
    'customer_age': np.random.randint(18, 65, size=500),
    'customer_gender': np.random.choice(['Male', 'Female'], size=500),
    'customer_location': np.random.choice(['Urban', 'Suburban', 'Rural'],
                                          size=500)
})

# Generate random product data (cm = contribution margin)
product_data = pd.DataFrame({
    'product_id': np.arange(1, 11),
    'product_category': np.random.choice(['Electronics', 'Clothing', 'Home',
                                          'Sports'], size=10),
    'product_brand': np.random.choice(['Brand_A', 'Brand_B', 'Brand_C'],
                                      size=10),
    'product_price': np.random.uniform(50, 500, size=10),
    'min_price': np.random.uniform(40, 400, size=10),
    'cm': np.random.uniform(50, 500, size=10) / 1000
})

# Generate random sales data
sales_data = pd.DataFrame({
    'customer_id': np.random.choice(np.arange(1, 501), size=1000),
    'product_id': np.random.choice(np.arange(1, 11), size=1000),
    'sales_quantity': np.random.randint(1, 10, size=1000)
})

# Merge the datasets to consolidate all the data, joining on the primary keys
all_data = pd.merge(customer_data, sales_data, on="customer_id")
all_data = pd.merge(all_data, product_data, on="product_id")
```
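As noted above, reading existing CSV files into pandas DataFrames is a practical alternative to generating synthetic data. A minimal sketch of that variant, assuming hypothetical file names and columns matching the synthetic DataFrames above:

```python
import pandas as pd

# Hypothetical file names; the columns are assumed to match the synthetic
# DataFrames generated above.
customer_data = pd.read_csv("customers.csv")  # customer_id, customer_age, ...
product_data = pd.read_csv("products.csv")    # product_id, product_price, min_price, cm, ...
sales_data = pd.read_csv("sales.csv")         # customer_id, product_id, sales_quantity

# The merging, preprocessing and modelling steps below are unchanged.
```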
Data Preprocessing

Before diving into model training, it's crucial to preprocess the data. This includes merging the datasets, defining the features and the target variable, and splitting the data into training and testing sets. The code also standardises the numerical features using the StandardScaler from scikit-learn.

```python
# Define the features (what you want the model to learn from) and the target
# variable. Apart from internal factors (features) such as customer age, CLV
# and marketing campaigns, you can also include external factors such as GDP,
# inflation, unemployment rate, salary levels, competitor prices, a market
# trend index, time of day, seasonality, weather conditions etc. They will
# make the model's predictions better. In this example, I am using only
# customer age and product price as features.
features = ['customer_age', 'product_price']

# As we want to optimise the price, it is essential for us to know the demand
# for each of our products. Therefore, our target variable is sales_quantity.
target_variable = 'sales_quantity'

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    all_data[features], all_data[target_variable],
    test_size=0.2, random_state=42)

# Standardise numerical features
scaler = StandardScaler()

# fit_transform for X_train: the fit step analyses the data and learns the
# transformation parameters (here, the mean and standard deviation of each
# feature in the training data); the transform step then scales the training
# data accordingly.
X_train_scaled = scaler.fit_transform(X_train)

# transform (without fitting again) for X_test: once the scaler has been
# fitted to the training data, the same learned parameters are applied to the
# test set. This matters because in real-world scenarios your model will
# encounter new, unseen data, and you want to evaluate its performance on such
# data. Calling fit_transform on the test set would cause data leakage: the
# scaler would learn different parameters, and the evaluation would no longer
# represent how the model performs on new data.
X_test_scaled = scaler.transform(X_test)
```
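To make the fit_transform versus transform distinction concrete, here is a quick optional check (not part of the original walkthrough): the scaled training features have mean close to 0 and standard deviation close to 1, while the scaled test features deviate slightly because they were transformed with the training set's parameters.

```python
# Optional sanity check: the scaler's parameters come from the training data
# only, so only the training set is exactly standardised.
print(X_train_scaled.mean(axis=0))  # approximately [0, 0]
print(X_train_scaled.std(axis=0))   # approximately [1, 1]
print(X_test_scaled.mean(axis=0))   # close to, but not exactly, [0, 0]
```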
Model Training

In this example, a simple and explainable linear regression model is employed to predict sales quantity. For more intricate scenarios, you can substitute it with advanced models such as XGBoost, random forests, decision trees, reinforcement learning mimicking human behaviour, deep learning models, or ensemble models tailored to your specific use case. Here, linear regression is used: the model is trained on the training set, and predictions are then generated on the test set.

```python
# Train a linear regression model (you can replace it with your preferred model)
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Predict sales quantity for the test set
y_pred = model.predict(X_test_scaled)
```

You can also use a neural network built with TensorFlow (Keras) instead; if you do, this block replaces the linear regression above:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu',
                          input_shape=(X_train_scaled.shape[1],)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])

# Compile the model using the Adam optimiser and MSE as the loss function
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train_scaled, y_train, epochs=10, batch_size=32,
          validation_split=0.2)  # split value assumed; it is cut off in the source
```

You can tune hyperparameters as well, such as the number of epochs, batch_size, validation_split, the optimiser, the loss function, the learning rate, the activation function and the number of hidden layers in deep learning models, or the depth of decision trees. Hyperparameters are the parameters over which you have control; the other parameters, such as weights, biases, coefficients, split points and leaf values, are learned by the model itself.

Predictions and Evaluation

After training the model, predictions are made on the test set, and the model's performance is evaluated using the root mean squared error (RMSE). The lower the RMSE, the better the model's predictive accuracy. There are other ways of evaluating a model depending on the model type, such as:

Classification models:
- Accuracy: the proportion of correct predictions.
- Precision: the proportion of true positives among all positive predictions.
- Recall: the proportion of true positives among all actual positives.
- F1 score: the harmonic mean of precision and recall, balancing the two metrics.
- False positive rate (FPR): the proportion of normal data points that are incorrectly classified as anomalies.
- True positive rate (TPR): the proportion of anomalies that are correctly classified as anomalies.
- Area under the ROC curve (AUC): a measure of overall model performance that considers both FPR and TPR.

Regression models:
- Mean squared error (MSE): the average squared difference between predicted and actual values.
- Root mean squared error (RMSE): the square root of MSE, which has the same units as the target variable.
- R-squared: the proportion of variance in the target variable that is explained by the model.
- Mean absolute error (MAE): the average absolute difference between predicted and actual values.

Clustering models:
- Purity: the proportion of data points in each cluster that belong to the most frequent class.
- Calinski-Harabasz Index (CHI): a measure of cluster separation and compactness.
- Silhouette Coefficient: a measure of how well data points are assigned to clusters.

One can also utilise techniques such as manual search, grid search, random search, or Bayesian optimisation to tune hyperparameters and train a model with better predictions, as sketched below.
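To illustrate one of these techniques, here is a hedged sketch of a grid search using scikit-learn's GridSearchCV. Plain linear regression has no regularisation knob to tune, so the sketch assumes a Ridge regression stand-in, and the alpha grid is purely illustrative, not from the original article:

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Illustrative sketch: Ridge is a stand-in because plain LinearRegression has
# no regularisation hyperparameter; the alpha values are arbitrary choices.
param_grid = {'alpha': [0.01, 0.1, 1.0, 10.0]}
grid = GridSearchCV(Ridge(), param_grid,
                    scoring='neg_root_mean_squared_error', cv=5)
grid.fit(X_train_scaled, y_train)
print(grid.best_params_, -grid.best_score_)  # best alpha and its CV RMSE
```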
```python
# Evaluate the model on the test set
y_pred = model.predict(X_test_scaled)
test_rmse = mean_squared_error(y_test, y_pred, squared=False)
print(f"Test RMSE: {test_rmse:.2f}")

# Make predictions on the entire dataset
all_data["predicted_sales"] = model.predict(scaler.transform(all_data[features]))

# You can also choose to display sample predictions using the following code:
# print(all_data[['customer_id', 'product_id', 'sales_quantity',
#                 'predicted_sales']].head(10))
```

Pricing Optimisation

Now comes the exciting part: using the model to optimise pricing. The optimisation code calculates a scaling factor based on the distribution of the predicted sales values. This factor adjusts the predicted sales, making them comparable and suitable for downstream processes. The adjusted sales are then used to calculate an adjusted product price. Additionally, a custom adjustment is applied based on the desired margin and a minimum price constraint. The final adjusted price is the maximum of the adjusted product price, the minimum price, and the price based on the desired margin.

Our predicted sales information is now ready, so it is time to optimise the pricing of our products. First, though, we calculate the mean, the standard deviation, and the scaling factor. The scaling factor is derived from how far each predicted sales value deviates from the mean, normalised by the standard deviation; it introduces a form of normalisation to the predicted sales so that they can be standardised across individual products. Because the calculation uses the mean and standard deviation, it is influenced by the distribution of the predicted sales: if there are outliers, the scaling factor will be sensitive to them, and values far from the mean receive a larger factor, which in turn influences the final optimised price. Note that I have used 0.1 in the calculation of the scaling factor; you can adjust it if you want the factor to have a larger or smaller impact. In summary, this step adjusts the predicted sales values based on their distribution, making them comparable and more suitable for downstream processes, analyses, or decision-making. It is a common preprocessing step to ensure that the data behaves in an expected and desirable way.

```python
# Scale predicted sales around their mean to obtain a per-row scaling factor
mean_sales = all_data["predicted_sales"].mean()
std_sales = all_data["predicted_sales"].std()
scaling_factor = 1 + (0.1 * (all_data["predicted_sales"] - mean_sales) / std_sales)
all_data['scf'] = scaling_factor.astype(float)

# Calculate the adjusted price based on the scaling factor
all_data['adj_psc'] = all_data['product_price'] * all_data['scf']
```
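A quick numeric illustration of the formula (with made-up numbers, not from the article): a predicted sales value exactly one standard deviation above the mean receives a factor of 1.1, one standard deviation below receives 0.9, and the mean itself maps to 1.0, so the 0.1 constant caps the typical price adjustment at roughly ten percent per standard deviation.

```python
import pandas as pd

# Standalone illustration of the scaling-factor formula: one value a standard
# deviation below the mean, the mean itself, and one a standard deviation above.
pred = pd.Series([80.0, 100.0, 120.0])  # hypothetical predicted sales
factor = 1 + 0.1 * (pred - pred.mean()) / pred.std()
print(factor.round(3).tolist())  # [0.9, 1.0, 1.1]
```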
Next, we apply a custom adjustment based on the margin and the minimum price. Here we calculate another price based on either the margin that we want to maintain on the product or the minimum price below which we cannot, or do not want to, go. apbm stands for "adjusted price based on the margin", using the margins (the cm column) that we defined in the product data set generated above.

```python
# apbm = adjusted price based on the margin
all_data["apbm"] = all_data["product_price"] * (1 + all_data["cm"])
```

In the following code we take the maximum of all the candidate prices. This rule can be modified based on your business case for optimising the product price.

```python
# I am taking the maximum of all the prices; this can also be customised to
# your requirements.
all_data['adj_price'] = np.maximum(
    np.maximum(all_data['min_price'], all_data['adj_psc']),
    all_data['apbm'])

# One could also use a loop with if-else statements to adjust the logic. Note
# that a loop with if-else is less efficient than vectorised operations:
# for index, row in all_data.iterrows():
#     if row['apbm'] <= row['adj_psc']:
#         all_data.at[index, 'adj_price'] = row['adj_psc']
#     else:
#         all_data.at[index, 'adj_price'] = row['min_price']

# Display the DataFrame with adjusted prices for a sample of 10 rows
print(all_data[['product_price', 'scf', 'min_price',
                'adj_psc', 'apbm', 'adj_price']].sample(n=10))
```

Conclusion

This blog provided a step-by-step guide to using machine learning for pricing optimisation. The example used a linear regression model, but the principles apply to more advanced models as well. Keep in mind that real-world applications may require more extensive feature engineering, hyperparameter tuning, and consideration of external factors, as mentioned in the comments within the code.

Implementing ML-based pricing optimisation can enhance your decision-making process, improve profitability, and help you adapt to dynamic market conditions. Experiment with different models and fine-tune the parameters to find the approach that best suits your business needs.

Code

pricing_optimisation/code/pricing_optimisation.py at main · varuntyagi83/pricing_optimisation
Contribute to varuntyagi83/pricing_optimisation development by creating an account on GitHub.
github.com

Tags: Pricing Model · Machine Learning · Optimization · Pricing Optimization · Data Science

Written by Varun Tyagi · Writer for Operations Research Bit
