Participants will learn that outliers are often not errors in the data and are sometimes the data points of greatest interest. Live demonstrations will reinforce why problem context is required to decide how to deal with outliers, and why mishandling extreme values can introduce model bias. This session covers a range of data preparation exercises, from constructing a data sandbox to creating training, test, and validation data sets for model development.
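As a minimal sketch of the point about outliers, the example below flags extreme values with the common 1.5 × IQR rule and compares the mean with and without them. The sample amounts and the function name `flag_outliers_iqr` are illustrative assumptions, not session material. The IQR rule is only one of several possible detection methods.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Return one boolean per value: True if it lies outside the IQR fences."""
    # Quartiles of the sample; "inclusive" treats the data as the full population.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical transaction amounts; the last value is far from the rest.
amounts = [12.0, 15.5, 14.2, 13.8, 16.1, 15.0, 14.7, 250.0]
flags = flag_outliers_iqr(amounts)

# Flagging is separate from treatment: compare summaries before deciding.
mean_all = statistics.mean(amounts)
mean_without_flagged = statistics.mean(
    [v for v, is_outlier in zip(amounts, flags) if not is_outlier]
)
```

Whether the flagged value should be removed, capped, or kept depends on the problem: in a fraud detection project it may be exactly the record of interest, while in a sensor feed it may be a recording error.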
You Will Learn
- Prepare a data sandbox for predictive analytics
- Detect and treat missing data and data quality issues
- Match data representations to the appropriate project types
- Construct various data transformations
- Handle data outliers without biasing model performance
- Build ‘train / test / validation’ data sets for model development
- Leave with resources, skills and plans to confidently process raw data for analytics
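The train / test / validation split listed above can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for the example, not recommendations from the session.

```python
import random


def split_dataset(rows, train=0.6, test=0.2, seed=42):
    """Shuffle rows and split them into train, test, and validation lists."""
    if not (0 < train < 1 and 0 < test < 1 and train + test < 1):
        raise ValueError("train and test must be fractions that sum to less than 1")
    shuffled = list(rows)                   # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],        # the remainder becomes validation
    )


# Hypothetical data set of 100 row identifiers.
rows = list(range(100))
train_set, test_set, validation_set = split_dataset(rows)
```

A simple random shuffle like this assumes the rows are independent. Time-ordered or grouped data usually needs a split that respects that structure.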
Geared To
- Analytic Practitioners
- Data Scientists
- IT Professionals
- Technology Planners
- Consultants
- Business Analysts
- Analytic Project Leaders