Cloud-Based Tools for Data Science


Fully Integrated Visual Tools:


Watson Studio and Watson OpenScale cover the complete development life cycle for data science, machine learning, and AI tasks.

Microsoft Azure Machine Learning is a full cloud-hosted offering supporting the entire development life cycle.

H2O Driverless AI is a downloadable product, but it offers one-click deployment on the standard cloud service providers.

Data Management in the Cloud:


Software-as-a-service (SaaS) versions of existing open source and commercial tools are available.

With these offerings, the cloud provider operates the tool for you, handling backups and installing updates.

Examples include Amazon DynamoDB (from Amazon Web Services), Cloudant (based on Apache CouchDB), and IBM Db2 offered as a service.

Cloud Data Integration Tools (ETL/ELT):


Informatica Cloud Data Integration and IBM’s Data Refinery are widely used.

Data Refinery, part of IBM Watson Studio, enables transforming raw data into consumable information in a spreadsheet-like interface.
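To make the idea of turning raw data into consumable information concrete, here is a minimal sketch of an extract, transform, and load flow in plain Python. It is not how Data Refinery or Informatica work internally; the sample data, column names, and function names are all made up for illustration.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, mixed case, and a missing value.
RAW_CSV = """name, city ,sales
 Alice ,berlin,100
Bob,PARIS,
carol, Berlin ,250
"""


def extract(text):
    """Read CSV text into a list of dicts with trimmed column names."""
    reader = csv.reader(io.StringIO(text))
    header = [col.strip() for col in next(reader)]
    return [dict(zip(header, row)) for row in reader if row]


def transform(rows):
    """Trim whitespace, normalize case, convert sales to int, drop incomplete rows."""
    cleaned = []
    for row in rows:
        sales = row["sales"].strip()
        if not sales:
            continue  # skip rows that have no sales figure
        cleaned.append({
            "name": row["name"].strip().title(),
            "city": row["city"].strip().title(),
            "sales": int(sales),
        })
    return cleaned


def load(rows):
    """Aggregate sales per city, standing in for a write to a target store."""
    totals = {}
    for row in rows:
        totals[row["city"]] = totals.get(row["city"], 0) + row["sales"]
    return totals


clean_rows = transform(extract(RAW_CSV))
city_totals = load(clean_rows)
print(city_totals)
```

In an ELT setup the same cleaning logic would run after the raw rows have been loaded into the target system rather than before.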

Cloud Data Visualization Tools:


Major cloud vendors offer data visualization tools.

Datameer is an example of a smaller company providing a cloud-based data visualization tool.

IBM offers its Cognos Business Intelligence suite, and Data Refinery also provides data exploration and visualization functionality in Watson Studio.

Data Science Visualizations in Watson Studio:


Watson Studio offers a variety of visualizations, including 3D bar charts, hierarchical edge bundling, scatter plots, heat maps, tree maps, pie charts, and word clouds.

Model Building as a Service:


Services like Watson Machine Learning (IBM) and AI Platform Training (Google) allow training and building models using various open-source libraries.

Model Deployment and Monitoring:


Model deployment is tightly integrated into the model-building process in commercial software.

SPSS Collaboration and Deployment Services is an example.

Models can also be exported in open formats such as PMML (Predictive Model Markup Language), so that other tools can read them.
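To give a feel for what an open model format looks like, the sketch below writes a tiny linear regression as a PMML document using only the Python standard library. In practice an export tool would generate this file; the function name `linear_model_to_pmml`, the feature names, and the coefficients are assumptions made up for the example, and the output is not validated against the PMML schema.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"


def linear_model_to_pmml(intercept, coefficients, target="y"):
    """Serialize y = intercept + sum(coef * feature) as a minimal PMML string."""
    pmml = ET.Element("PMML", {"version": "4.4", "xmlns": PMML_NS})
    ET.SubElement(pmml, "Header", {"description": "Hand-written linear regression sketch"})

    # DataDictionary declares every field the model reads or produces.
    data_dict = ET.SubElement(pmml, "DataDictionary")
    for name in list(coefficients) + [target]:
        ET.SubElement(data_dict, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(pmml, "RegressionModel", {"functionName": "regression"})

    # MiningSchema says which fields are inputs and which one is the target.
    schema = ET.SubElement(model, "MiningSchema")
    for name in coefficients:
        ET.SubElement(schema, "MiningField", {"name": name})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # RegressionTable holds the intercept and one predictor per feature.
    table = ET.SubElement(model, "RegressionTable", {"intercept": str(intercept)})
    for name, coef in coefficients.items():
        ET.SubElement(table, "NumericPredictor",
                      {"name": name, "coefficient": str(coef)})

    return ET.tostring(pmml, encoding="unicode")


# Hypothetical model: price = 1.5 + 2.0 * rooms + 0.25 * area
pmml_text = linear_model_to_pmml(1.5, {"rooms": 2.0, "area": 0.25}, target="price")
print(pmml_text)
```

Because the result is plain XML, any PMML-aware scoring engine could in principle load it without needing the library that trained the model.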

Cloud tools like Amazon SageMaker Model Monitor and Watson OpenScale enable continuous monitoring of deployed machine learning models.
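The idea behind continuous monitoring can be sketched in a few lines: keep track of how a deployed model performs on recent predictions and raise a flag when quality drops. The class below is a simplified illustration only; it is not the API of Amazon SageMaker Model Monitor or Watson OpenScale, and the name `AccuracyMonitor`, the window size, and the threshold are assumptions.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions and flags drops."""

    def __init__(self, window_size=100, min_accuracy=0.8):
        # A bounded deque discards the oldest outcome once the window is full.
        self.window = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether one prediction matched the label that arrived later."""
        self.window.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def needs_attention(self):
        """Flag only once the window is full, to avoid alerts on tiny samples."""
        if len(self.window) < self.window.maxlen:
            return False
        return self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)

# Four correct predictions: the model looks healthy.
for predicted, actual in [(1, 1), (0, 0), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
flagged_before = monitor.needs_attention()

# Two wrong predictions push the two oldest correct ones out of the window.
for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)
flagged_after = monitor.needs_attention()

print(flagged_before, flagged_after, monitor.accuracy())
```

A production monitoring service would typically watch more than accuracy, but the rolling-window check captures the basic loop of observing, comparing against a threshold, and alerting.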

Integration for Multiple Tasks:


Cloud-based tools often integrate multiple tasks, providing a seamless experience for users.

In summary, cloud-based tools offer integrated solutions for data science tasks, covering data management, integration, visualization, model building, deployment, and monitoring. The video emphasizes the trend of integration in these tools, making them versatile and user-friendly for various data science activities.

