This is the complete guide to understanding dimensions in machine learning.

Dimension in Machine Learning

The number of input variables (or feature columns) in a given dataset is termed its dimension in machine learning.

Example: Salary of employees based on designation and years of experience.

Emp_num | Designation | Years_of_experience | Salary in 1000$
51 | Software Engineer | 2 | 15
108 | Software Developer | 5 | 45
67 | Software Tester | 4 | 28
89 | Data Analyst | 5 | 50

Here there are 3 feature (input) variables, so the dimension is 3. Emp_num, Designation, and Years_of_experience are the feature columns, and Salary in 1000$ is the label (output) column.

What happens if you have high dimensions in the given dataset? The machine learning model then faces the usual problem: patterns and relationships in the data become harder to identify.

Example: Salary of employees based on designation

Emp_num | Designation | Emp_Age | Gender | Years_
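Going back to the first salary table (the one with four employees), the counting rule can be sketched in a few lines of Python. This is only an illustration: the key name `Salary_in_1000_usd` and the helper `feature_columns` are hypothetical names chosen for this sketch, and every column other than the label is treated as a feature, as the example does.

```python
# Rows copied from the first salary table; the key names are assumptions
# made for this sketch (the label key is a hypothetical spelling).
rows = [
    {"Emp_num": 51, "Designation": "Software Engineer",
     "Years_of_experience": 2, "Salary_in_1000_usd": 15},
    {"Emp_num": 108, "Designation": "Software Developer",
     "Years_of_experience": 5, "Salary_in_1000_usd": 45},
    {"Emp_num": 67, "Designation": "Software Tester",
     "Years_of_experience": 4, "Salary_in_1000_usd": 28},
    {"Emp_num": 89, "Designation": "Data Analyst",
     "Years_of_experience": 5, "Salary_in_1000_usd": 50},
]

LABEL = "Salary_in_1000_usd"  # the output column, not counted as a dimension


def feature_columns(records, label):
    """Return every column name except the label (hypothetical helper)."""
    return [name for name in records[0] if name != label]


features = feature_columns(rows, LABEL)
dimension = len(features)

print(features)   # ['Emp_num', 'Designation', 'Years_of_experience']
print(dimension)  # 3
```

The dimension depends only on the number of feature columns, not on the number of rows: adding more employees to `rows` would leave it at 3.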
Do you want to build a machine learning model? Don't know what a dataset is? Confused about which type of dataset to use when building the model? (New to ML? Read our Machine Learning Introduction.) Then let's get started with this quick guide to machine learning datasets.

What is a Machine Learning (ML) Dataset?

A dataset, as the name says, is a set of data. It is a collection of data that is treated as a single unit for analytics and predictions.

The dataset used in a machine learning problem can be a population dataset or a sample dataset. Most of the time it is a sample dataset. The model makes predictions based on the patterns identified in this dataset. Once the model is trained, it is tested for accuracy by checking how well it works on the test dataset.

Example: Let us consider the test scores dataset of a student.

Subject | Marks obtained | Performance Level
English | 85
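The train-then-test idea described above can be sketched as a simple split of a sample dataset into a training part and a test part. This is a minimal sketch, not a method prescribed by this guide: the helper name `train_test_split`, the 80/20 ratio, the fixed seed, and the placeholder records are all assumptions made for illustration.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split off a test portion.

    The 20% test fraction and the seed are illustrative defaults only.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for the rows of a sample dataset.
sample = list(range(10))
train, test = train_test_split(sample)

print(len(train), len(test))  # 8 2
```

The model would be fitted on `train` only, and its accuracy would then be measured on `test`, which it has not seen during training.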