Review what you learned

The Foundational Methodology, a data science methodology developed by John Rollins, consists of 10 stages, starting with Business Understanding and ending with Feedback.


CRISP-DM (Cross-Industry Standard Process for Data Mining), an open-source data methodology, combines several stages into one and omits the Feedback stage, resulting in a six-stage data methodology.


The primary goal of the Business Understanding stage is to understand the business problem and determine the necessary data to answer the core business question.


During the Analytic Approach stage, you can choose from descriptive, diagnostic, predictive, and prescriptive analytic approaches and decide whether to use machine learning techniques.


In the Data Requirements stage, data scientists identify the correct data content, formats, and sources needed for the specific analytical approach.


During Data Collection, expert data scientists revise data requirements and make critical decisions regarding the quantity and quality of data.


The Data Understanding stage encompasses constructing the dataset and answering whether it represents the data needed to solve the business problem.


During the Data Understanding stage, data scientists commonly apply descriptive statistics and visualizations such as the mean, median, minimum, maximum, standard deviation, pairwise correlation, and histograms.
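
As a minimal sketch of those Data Understanding checks, the following uses only the Python standard library. The sample columns (`ages`, `spend`) and the helper names (`summarize`, `pearson`, `histogram`) are hypothetical and chosen for illustration; in practice a library such as pandas would usually do this work.

```python
import statistics
from collections import Counter

# Hypothetical sample: ages and yearly spend for a handful of customers.
ages = [23, 35, 41, 29, 35, 52]
spend = [120.0, 340.0, 410.0, 180.0, 300.0, 560.0]


def summarize(values):
    """Return basic descriptive statistics for one numeric column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pairwise (Pearson) correlation between two equally long columns."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


age_summary = summarize(ages)
age_spend_corr = pearson(ages, spend)
age_hist = histogram(ages, 10)
```

A correlation close to 1 or -1 between two columns suggests that one of them may be redundant, while a lopsided histogram can flag a variable that needs closer inspection before modeling.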


During Data Preparation, data scientists address missing or invalid values, remove duplicates, and validate that the data is properly formatted.
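
The cleaning steps above can be sketched as follows. This is an illustrative example under stated assumptions: the records in `raw_rows`, the `clean` helper, and the choice to fill missing ages with the median are all hypothetical, not a prescribed procedure.

```python
import statistics

# Hypothetical raw records; "" and None stand in for missing values.
raw_rows = [
    {"id": "1", "age": "34", "city": " Boston "},
    {"id": "2", "age": "", "city": "Austin"},
    {"id": "2", "age": "", "city": "Austin"},    # exact duplicate
    {"id": "3", "age": "-5", "city": "denver"},  # invalid age
    {"id": "4", "age": "41", "city": None},      # missing city
]


def clean(rows):
    """Remove duplicates, normalize formats, and fill missing ages."""
    seen = set()
    cleaned = []
    for row in rows:
        # Drop rows that repeat an earlier row exactly.
        key = (row["id"], row["age"], row["city"])
        if key in seen:
            continue
        seen.add(key)

        # Treat empty, non-numeric, or negative ages as missing.
        age_text = (row["age"] or "").strip()
        age = int(age_text) if age_text.isdigit() else None

        # Normalize whitespace and capitalization of city names.
        city = (row["city"] or "").strip().title() or None

        cleaned.append({"id": int(row["id"]), "age": age, "city": city})

    # Assumption: impute missing ages with the median of the known ages.
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill = statistics.median(known_ages) if known_ages else None
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill
    return cleaned


cleaned = clean(raw_rows)
```

Whether to impute, drop, or flag a missing value depends on the business question; median imputation is used here only because it is short to demonstrate.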


Feature engineering and text analysis are key techniques applied during the Data Preparation stage.
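
To make feature engineering and text analysis concrete, here is a small sketch that derives new variables from a single record. The `record` fields and the `engineer_features` helper are hypothetical; real projects pick features based on domain knowledge of the business problem.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical customer record with a date, numeric fields, and free text.
record = {
    "signup": date(2023, 3, 4),
    "total_spend": 250.0,
    "orders": 5,
    "review": "Great service, great price. Delivery was slow.",
}


def engineer_features(rec):
    """Derive model-ready features from raw fields."""
    # Simple text analysis: lowercase the review and split it into word tokens.
    tokens = re.findall(r"[a-z']+", rec["review"].lower())
    counts = Counter(tokens)
    return {
        "signup_weekday": rec["signup"].weekday(),  # Monday = 0
        "avg_order_value": rec["total_spend"] / rec["orders"],
        "review_word_count": len(tokens),
        "mentions_great": counts["great"],
    }


features = engineer_features(record)
```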


In the Modeling stage, data scientists build a data model intended to answer the business question, testing multiple algorithms and assessing whether the data supports the answer.


The Evaluation stage consists of two phases, diagnostic measures and statistical significance testing, which assess the quality of the model and determine whether adjustments are needed.
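
As one illustration of the diagnostic-measures phase, the sketch below computes a confusion matrix and three common metrics. It assumes a binary classifier; the label lists `y_true` and `y_pred` and the helper `diagnostics` are hypothetical. Other model types call for different diagnostics.

```python
# Hypothetical true labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]


def diagnostics(actual, predicted):
    """Return confusion-matrix counts plus accuracy, precision, and recall."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


report = diagnostics(y_true, y_pred)
```

If a metric falls short of what the business question requires, that is the signal to return to an earlier stage and adjust the model or the data.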


During Deployment, data scientists release the data model to targeted stakeholders, including solution owners, marketing staff, application developers, and IT administration.


In the Feedback stage, stakeholders and users evaluate the model and provide feedback on its performance. The value of the data model depends on how well it is refined through iterations driven by that feedback.




