Table of Contents
Getting Started with FormulaML
Welcome! This guide will help you get up and running with FormulaML in just a few minutes. By the end, you’ll have created your first machine learning model entirely within Excel.
Installation
Prerequisites
- Microsoft Excel 2016 or later (Windows or Mac)
- Active internet connection for initial setup
- Excel add-ins enabled in your Office settings
Installing FormulaML
- Open Excel
- Go to Insert → Get Add-ins (or Office Add-ins)
- Search for “FormulaML” in the store
- Click Add to install the add-in
- Once installed, you’ll see FormulaML functions available with the
ML.
prefix
Verifying Installation
Test that FormulaML is working by typing this formula in any cell:
=ML.DATASETS.IRIS()
If you see <Dataset>
, FormulaML is installed correctly!
Key Concepts
Before diving in, let’s understand three essential concepts:
1. Object Handles
FormulaML uses “object handles” to manage complex data structures in Excel. When you see something like <Dataset>
or <LinearRegression>
, that’s an object handle. Think of it as Excel’s way of storing a complete dataset or model in a single cell.
Example:
- Cell A1:
=ML.DATASETS.IRIS()
returns<Dataset>
- This handle represents the entire Iris dataset (150 rows, 5 columns)
2. The ML Workflow in Excel
The typical machine learning workflow in FormulaML follows these steps:
- Load Data → Creates a DataFrame handle
- Split Data → Separates features (X) and target (y)
- Train Model → Uses ML.FIT to create a model handle
- Make Predictions → Uses ML.PREDICT with new data
- Evaluate → Check model performance with ML.EVAL.SCORE
3. Function Namespaces
All FormulaML functions are organized into logical namespaces:
ML.DATASETS.*
- Built-in datasets for learningML.DATA.*
- Data manipulation and explorationML.REGRESSION.*
- Regression algorithmsML.CLASSIFICATION.*
- Classification algorithmsML.CLUSTERING.*
- Clustering algorithmsML.EVAL.*
- Model evaluation functionsML.PREPROCESSING.*
- Data preparation tools
Your First Model
Let’s create a simple classification model using the famous Iris dataset. This dataset contains measurements of iris flowers and classifies them into three species.
Step 1: Load the Data
In cell A1, enter:
=ML.DATASETS.IRIS()
Result: <Dataset>
Step 2: Explore the Data
Let’s see what’s in our dataset. In cell A3, enter:
=ML.DATA.INFO(A1)
This shows the structure of your data: 150 samples with 4 features plus 1 target column.
To see a sample of actual data, in cell A5:
=ML.DATA.SAMPLE(A1, 5)
This displays 5 random rows from your dataset.
Step 3: Prepare Features and Target
We need to separate our features (flower measurements) from our target (species).
In cell B1 (select features - columns 0-3):
=ML.DATA.SELECT_COLUMNS(A1, {0,1,2,3})
In cell C1 (select target - column 4):
=ML.DATA.SELECT_COLUMNS(A1, 4)
Step 4: Split for Training and Testing
Good practice requires splitting data into training and testing sets.
In cell A10:
=ML.PREPROCESSING.TRAIN_TEST_SPLIT(B1, C1, 0.2, 42)
This splits 80% for training, 20% for testing. The result is an array of 4 handles:
- X_train (training features)
- X_test (test features)
- y_train (training target)
- y_test (test target)
Step 5: Create and Train a Model
Let’s use a Support Vector Machine (SVM) classifier. In cell A15:
=ML.CLASSIFICATION.SVM()
Result: <SVC>
(untrained model)
Now train it with your data. In cell A16:
=ML.FIT(A15, INDEX(A10,1), INDEX(A10,3))
Result: <SVC>
with 🧠icon (trained model)
This trains the SVM model using X_train and y_train from the split.
Step 6: Make Predictions
Use your trained model to predict on test data. In cell A18:
=ML.PREDICT(A16, INDEX(A10,2))
Result: Predictions for your test set
Step 7: Evaluate Performance
Check how well your model performs. In cell A20:
=ML.EVAL.SCORE(A16, INDEX(A10,2), INDEX(A10,4))
Result: A score between 0 and 1 (1 being perfect accuracy)
Congratulations! You’ve just built and evaluated your first machine learning model in Excel!
Understanding Your Results
What the Score Means
- Classification models: Score represents accuracy (0.95 = 95% correct predictions)
- Regression models: Score represents R² (how well the model fits the data)
- Scores closer to 1 indicate better performance
Object Handle Management
- Object handles persist during your Excel session
- Save your workbook to preserve handles between sessions
- Use descriptive cell labels to track what each handle represents
Next Steps
Now that you understand the basics, explore these areas:
📊 Work with Your Own Data
Learn to convert Excel data to DataFrames and apply ML to your datasets.
🤖 Try Different Algorithms
Explore other models like Linear Regression for numerical predictions or K-Means for clustering.
📈 Advanced Evaluation
Learn about cross-validation and hyperparameter tuning (Premium features).
💡 Complete Examples
Follow our step-by-step tutorials for real-world scenarios:
Common Questions
“What if I get an error?”
Check our troubleshooting guide for common issues and solutions.
“Can I use my own data?”
Yes! Learn how in our Working with Data guide.
“Which algorithm should I use?”
See our Models and Algorithms guide for choosing the right model.
“How do I improve accuracy?”
Explore model evaluation techniques and advanced features.
Quick Reference Card
Task | Formula Example |
---|---|
Load Iris dataset | =ML.DATASETS.IRIS() |
View data info | =ML.DATA.INFO(dataframe) |
Select columns | =ML.DATA.SELECT_COLUMNS(dataframe, {0,1,2}) |
Split data | =ML.PREPROCESSING.TRAIN_TEST_SPLIT(X, y, 0.2) |
Create SVM model | =ML.CLASSIFICATION.SVM() |
Train model | =ML.FIT(model, X_train, y_train) |
Make predictions | =ML.PREDICT(trained_model, X_test) |
Evaluate model | =ML.EVAL.SCORE(trained_model, X_test, y_test) |
Ready to dive deeper? Continue to Working with Data →