Thursday 2 April 2020

Data & AI Platforms - Open Source Vs Managed Services (AWS vs Azure vs GCP)

Hello everyone! In this post I compare open source tools with cloud-based managed services (AWS, Azure, and GCP) for each layer of a data and artificial intelligence platform.



Capability | Open Source | AWS | Azure | GCP | Description
Ingest
Streaming | Apache Kafka | Kinesis Streams/Firehose | Azure Event Hubs | Cloud Pub/Sub | Services that allow the mass ingestion of small data inputs, typically from devices and sensors, to process and route the data.
IoT | Kaa | AWS IoT | Azure IoT | Cloud IoT Core | A cloud gateway for managing bidirectional communication with billions of IoT devices, securely and at scale.
Messages | Apache ActiveMQ | Amazon SQS | Azure Service Bus | Cloud Pub/Sub | Supports a set of cloud-based, message-oriented middleware technologies including reliable message queuing and durable publish/subscribe messaging.
Batch | Apache Spark | Data Pipeline | Azure Data Factory | Cloud Data Transfer | Processes and moves data between different compute and storage services, as well as on-premises data sources at specified intervals. Create, schedule, orchestrate, and manage data pipelines.
Store
InMemory | Redis | Amazon ElastiCache | Azure Redis Cache | Cloud Memorystore | An in-memory-based, distributed caching service that provides a high-performance store typically used to offload nontransactional work from a database.
SQL OLTP | MySQL | Amazon RDS/Aurora | Azure SQL Database | Cloud SQL/Cloud Spanner | Managed relational database service where resiliency, scale, and maintenance are primarily handled by the platform.
NoSQL-Key Value | Redis | Amazon DynamoDB | Table Storage | Cloud Bigtable | A globally distributed, multi-model database that natively supports multiple data models: key-value, documents, graphs, and columnar.
NoSQL-Indexed | Apache Cassandra | Amazon SimpleDB | Azure Cosmos DB | Firestore | -
Object | MinIO | Amazon S3 | Azure Data Lake Storage | Cloud Storage | Object storage service, for use cases including cloud applications, content distribution, backup, archiving, disaster recovery, and big data analytics.
Cool | - | S3 IA | Azure Storage cool tier | Coldline Storage | Cool storage is a lower-cost tier for storing data that is infrequently accessed and long-lived.
Archive | - | Amazon Glacier | Azure Storage archive access tier | Archive Storage | Archive storage has the lowest storage cost and higher data retrieval costs compared to hot and cool storage.
Backup | - | AWS Backup | Azure Backup | Nearline Storage | Back up and recover files and folders from the cloud, and provide offsite protection against data loss.
Process
MapReduce | Hadoop/Spark | Amazon EMR | Azure HDInsight/Databricks | Cloud Dataproc | Managed Hadoop/Spark service.
Data Movement | Airflow | AWS Data Pipeline | Azure Data Factory | Cloud Dataprep | Processes and moves data between different compute and storage services, as well as on-premises data sources at specified intervals. Create, schedule, orchestrate, and manage data pipelines.
Batch Computing | Apache Nifi | AWS Batch | Azure Batch | Cloud Dataflow | Run large-scale parallel and high-performance computing applications efficiently in the cloud.
Serverless Computing | Kubeless | AWS Lambda | Azure Functions | Cloud Functions | Runs code in response to events and automatically manages the computing resources required by that code.
Analyze
Interactive | Presto | Amazon Athena | Data Lake Analytics | Cloud Datalab | Provides a serverless interactive query service that uses standard SQL for analyzing databases.
SQL OLAP | Apache Kylin | Amazon Redshift | Azure Synapse Analytics | BigQuery analytics | Cloud-based Enterprise Data Warehouse (EDW) that uses Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data.
AI/ML | scikit-learn/TensorFlow | Amazon SageMaker | Azure Machine Learning | AI Platform | A cloud service to train, deploy, automate, and manage machine learning models.
Stream Analytics | Apache Flink | Amazon Kinesis Analytics | Stream Analytics | Cloud Dataflow | Storage and analysis platforms that create insights from large quantities of data, or data that originates from many sources.
Search Analytics | Elasticsearch | Amazon Elasticsearch | Azure Search | Cloud Search | Delivers full-text search and related search analytics and capabilities.
AI/ML
Speech | Simon/Kaldi | Amazon Transcribe/Polly | Cognitive Services - Speech | Speech-to-Text | Enables both speech-to-text and text-to-speech capabilities.
Vision | OpenCV | Amazon Rekognition | Cognitive Services - Computer Vision | Cloud Vision | Computer Vision: extract information from images to categorize and process visual data. Face: detect, identify, and analyze faces in photos. Emotions: recognize emotions in images.
NLP | NLTK/OpenNLP | Amazon Comprehend | Cognitive Services - Language | Cloud Natural Language API | -
Translation | OpenNMT | Amazon Translate | - | Cloud Translation | -
Conversational Interface | RASA | Amazon Lex | - | Dialogflow Enterprise Edition | -
Video intelligence | - | Amazon Rekognition Video | Video Indexer | Video AI | -
Auto-generated Models | TPOT/AutoKeras | AutoGluon | Automated Machine Learning | AutoML | -
Fully Managed ML | scikit-learn/TensorFlow | Amazon SageMaker | Azure Machine Learning | AI Platform | -
Visualize
BI & Reporting | BIRT | Amazon QuickSight | Power BI | DataStudio | Business intelligence tools that build visualizations, perform ad hoc analysis, and develop business insights from data.
- | - | - | - | Google Sheets | -


Govern
Access Control | OpenIAM | AWS IAM | MS Identity Platform | Cloud IAM | Allows users to securely control access to services and resources while offering data security and protection. Create and manage users and groups, and use permissions to allow and deny access to resources.
Monitoring | Nagios | Amazon CloudWatch | Azure Monitor | Cloud Monitoring | Comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments.
Logging | Logstash/Graylog | Amazon CloudWatch Logs | Log Analytics | Cloud Logging | -
Data Catalog | TrueDat | AWS Glue | Data Catalog | Data Catalog | A fully managed service that serves as a system of registration and system of discovery for enterprise data sources.
- | Hive Metastore | Amazon Athena Catalog | - | - | -
Manage
Workflow Orchestration | Airflow | AWS Glue | Azure Logic Apps | Cloud Composer | Cloud technology to build distributed applications using out-of-the-box connectors to reduce integration challenges. Connect apps, data and devices on-premises or in the cloud.
Deployment | Terraform | AWS CloudFormation | Azure Resource Manager | Cloud Deployment Manager | Provides a way for users to automate the manual, long-running, error-prone, and frequently repeated IT tasks.
API management | API Umbrella/APIman | API Gateway | API Management | Apigee/Cloud Endpoints | A turnkey solution for publishing APIs to external and internal consumers.
DevOps | Gradle/Jenkins | AWS CodeBuild/CodeCommit/CodeDeploy/CodePipeline | Azure DevOps | DevOps | Fully managed build service that supports continuous integration and deployment.
Compute
IaaS | OpenStack | Amazon EC2 | Virtual Machines | Compute Engine | Virtual servers allow users to deploy, manage, and maintain OS and server software. Instance types provide combinations of CPU/RAM. Users pay for what they use with the flexibility to change sizes.
Containers | Kubernetes | Amazon Elastic Container Service | Azure Kubernetes Service/Azure Service Fabric | Google Kubernetes Engine | Azure Container Instances is the fastest and simplest way to run a container in Azure, without having to provision any virtual machines or adopt a higher-level orchestration service.
Auto Scaling | KEDA/Nomad | AWS Auto Scaling | Virtual Machine Scale Sets | Autoscaling | Allows you to automatically change the number of VM instances. You define metrics and thresholds that determine whether the platform adds or removes instances.
Load Balancing | Seesaw/LoadMaster | Application Load Balancer | Application Gateway | Load balancing | Application Gateway is a layer 7 load balancer. It supports SSL termination, cookie-based session affinity, and round robin for load-balancing traffic.
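
If you want to use the comparison above programmatically, for example to look up the counterpart of a service on another platform, it can be held in a small lookup structure. The sketch below is only an illustration: it copies three rows from the comparison, and the key names (`open_source`, `aws`, `azure`, `gcp`) and the helper functions `equivalent` and `find_capability` are hypothetical names chosen for this example, not part of any SDK.

```python
# Illustrative only: three rows copied from the comparison above.
# The key names and helper functions are hypothetical, chosen for this sketch.
SERVICES = {
    "Streaming": {
        "open_source": "Apache Kafka",
        "aws": "Kinesis Streams/Firehose",
        "azure": "Azure Event Hubs",
        "gcp": "Cloud Pub/Sub",
    },
    "Serverless Computing": {
        "open_source": "Kubeless",
        "aws": "AWS Lambda",
        "azure": "Azure Functions",
        "gcp": "Cloud Functions",
    },
    "SQL OLAP": {
        "open_source": "Apache Kylin",
        "aws": "Amazon Redshift",
        "azure": "Azure Synapse Analytics",
        "gcp": "BigQuery analytics",
    },
}


def equivalent(capability, platform):
    """Return the service listed for a capability on the given platform."""
    return SERVICES[capability][platform]


def find_capability(service):
    """Return the capability whose row lists the given service, or None."""
    for capability, row in SERVICES.items():
        if service in row.values():
            return capability
    return None


if __name__ == "__main__":
    # Which GCP service sits in the same row as Apache Kafka?
    print(equivalent("Streaming", "gcp"))
    # Which capability does AWS Lambda belong to?
    print(find_capability("AWS Lambda"))
```

Extending the dictionary with the remaining rows follows the same pattern; rows with an empty cell could store `None` for that platform.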







