Hello everyone! In this post I compare open source tools with the
cloud-based managed services from AWS, Azure, and GCP for a data and
artificial intelligence platform.
| Category | Capability | Open Source | AWS | Azure | GCP | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Ingest | Streaming | Apache Kafka | Kinesis Streams/Firehose | Azure Event Hubs | Cloud Pub/Sub | Services that allow the mass ingestion of small data inputs, typically from devices and sensors, to process and route the data. |
| | IoT | Kaa | AWS IoT | Azure IoT | Cloud IoT Core | A cloud gateway for managing bidirectional communication with billions of IoT devices, securely and at scale. |
| | Messages | Apache ActiveMQ | Amazon SQS | Azure Service Bus | Cloud Pub/Sub | Supports a set of cloud-based, message-oriented middleware technologies, including reliable message queuing and durable publish/subscribe messaging. |
| | Batch | Apache Spark | Data Pipeline | Azure Data Factory | Cloud Data Transfer | Processes and moves data between different compute and storage services, as well as on-premises data sources, at specified intervals. Create, schedule, orchestrate, and manage data pipelines. |
| Store | In-Memory | Redis | Amazon ElastiCache | Azure Redis Cache | Cloud Memorystore | An in-memory, distributed caching service that provides a high-performance store, typically used to offload nontransactional work from a database. |
| | SQL OLTP | MySQL | Amazon RDS/Aurora | Azure SQL Database | Cloud SQL/Cloud Spanner | Managed relational database service where resiliency, scale, and maintenance are primarily handled by the platform. |
| | NoSQL-Key Value | Redis | Amazon DynamoDB | Table Storage | Cloud Bigtable | A globally distributed, multi-model database that natively supports multiple data models: key-value, documents, graphs, and columnar. |
| | NoSQL-Indexed | Apache Cassandra | Amazon SimpleDB | Azure Cosmos DB | Firestore | |
| | Object | MinIO | Amazon S3 | Azure Data Lake Storage | Cloud Storage | Object storage service, for use cases including cloud applications, content distribution, backup, archiving, disaster recovery, and big data analytics. |
| | Cool | | S3 IA | Azure Storage cool tier | Coldline Storage | Cool storage is a lower-cost tier for storing data that is infrequently accessed and long-lived. |
| | Archive | | Amazon Glacier | Azure Storage archive access tier | Archive Storage | Archive storage has the lowest storage cost and higher data retrieval costs compared to hot and cool storage. |
| | Backup | | AWS Backup | Azure Backup | Nearline Storage | Back up and recover files and folders from the cloud, and provide offsite protection against data loss. |
| Process | MapReduce | Hadoop/Spark | Amazon EMR | Azure HDInsight/Databricks | Cloud Dataproc | Managed Hadoop/Spark service. |
| | Data Movement | Airflow | AWS Data Pipeline | Azure Data Factory | Cloud Dataprep | Processes and moves data between different compute and storage services, as well as on-premises data sources, at specified intervals. Create, schedule, orchestrate, and manage data pipelines. |
| | Batch Computing | Apache NiFi | AWS Batch | Azure Batch | Cloud Dataflow | Run large-scale parallel and high-performance computing applications efficiently in the cloud. |
| | Serverless Computing | Kubeless | AWS Lambda | Azure Functions | Cloud Functions | Runs code in response to events and automatically manages the computing resources required by that code. |
| Analyze | Interactive | Presto | Amazon Athena | Data Lake Analytics | Cloud Datalab | Provides a serverless interactive query service that uses standard SQL for analyzing databases. |
| | SQL OLAP | Apache Kylin | Amazon Redshift | Azure Synapse Analytics | BigQuery analytics | Cloud-based Enterprise Data Warehouse (EDW) that uses Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data. |
| | AI/ML | scikit-learn/TensorFlow | Amazon SageMaker | Azure Machine Learning | AI Platform | A cloud service to train, deploy, automate, and manage machine learning models. |
| | Stream Analytics | Apache Flink | Amazon Kinesis Analytics | Stream Analytics | Cloud Dataflow | Storage and analysis platforms that create insights from large quantities of data, or data that originates from many sources. |
| | Search Analytics | Elasticsearch | Amazon Elasticsearch | Azure Search | Cloud Search | Delivers full-text search and related search analytics and capabilities. |
| AI/ML | Speech | Simon/Kaldi | Amazon Transcribe/Polly | Cognitive Services - Speech | Speech-to-Text | Enables both speech-to-text and text-to-speech capabilities. |
| | Vision | OpenCV | Amazon Rekognition | Cognitive Services - Computer Vision | Cloud Vision | Computer Vision: extract information from images to categorize and process visual data. Face: detect, identify, and analyze faces in photos. Emotions: recognize emotions in images. |
| | NLP | NLTK/OpenNLP | Amazon Comprehend | Cognitive Services - Language | Cloud Natural Language API | |
| | Translation | OpenNMT | Amazon Translate | | Cloud Translation | |
| | Conversational Interface | Rasa | Amazon Lex | | Dialogflow Enterprise Edition | |
| | Video Intelligence | | Amazon Rekognition Video | Video Indexer | Video AI | |
| | Auto-generated Models | TPOT/AutoKeras | AutoGluon | Automated Machine Learning | AutoML | |
| | Fully Managed ML | scikit-learn/TensorFlow | Amazon SageMaker | Azure Machine Learning | AI Platform | |
| Visualize | BI & Reporting | BIRT | Amazon QuickSight | Power BI | Data Studio | Business intelligence tools that build visualizations, perform ad hoc analysis, and develop business insights from data. |
| | | | | | Google Sheets | |
| Govern | Access Control | OpenIAM | AWS IAM | MS Identity Platform | Cloud IAM | Allows users to securely control access to services and resources while offering data security and protection. Create and manage users and groups, and use permissions to allow and deny access to resources. |
| | Monitoring | Nagios | Amazon CloudWatch | Azure Monitor | Cloud Monitoring | Comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments. |
| | Logging | Logstash/Graylog | Amazon CloudWatch Logs | Log Analytics | Cloud Logging | |
| | Data Catalog | TrueDat | AWS Glue | Data Catalog | Data Catalog | A fully managed service that serves as a system of registration and system of discovery for enterprise data sources. |
| | | Hive Metastore | | | | |
| | | | Amazon Athena Catalog | | | |
| Manage | Workflow Orchestration | Airflow | AWS Glue | Azure Logic Apps | Cloud Composer | Cloud technology to build distributed applications using out-of-the-box connectors to reduce integration challenges. Connect apps, data, and devices on-premises or in the cloud. |
| | Deployment | Terraform | AWS CloudFormation | Azure Resource Manager | Cloud Deployment Manager | Provides a way for users to automate manual, long-running, error-prone, and frequently repeated IT tasks. |
| | API Management | API Umbrella/APIman | API Gateway | API Management | Apigee/Cloud Endpoints | A turnkey solution for publishing APIs to external and internal consumers. |
| | DevOps | Gradle/Jenkins | AWS CodeBuild/CodeCommit/CodeDeploy/CodePipeline | Azure DevOps | DevOps | Fully managed build service that supports continuous integration and deployment. |
| Compute | IaaS | OpenStack | Amazon EC2 | Virtual Machines | Compute Engine | Virtual servers allow users to deploy, manage, and maintain OS and server software. Instance types provide combinations of CPU/RAM. Users pay for what they use, with the flexibility to change sizes. |
| | Containers | Kubernetes | Amazon Elastic Container Service | Azure Kubernetes Service/Azure Service Fabric | Google Kubernetes Engine | Azure Container Instances is the fastest and simplest way to run a container in Azure, without having to provision any virtual machines or adopt a higher-level orchestration service. |
| | Auto Scaling | KEDA/Nomad | AWS Auto Scaling | Virtual Machine Scale Sets | Autoscaling | Allows you to automatically change the number of VM instances. You define metrics and thresholds that determine whether the platform adds or removes instances. |
| | Load Balancing | Seesaw/LoadMaster | Application Load Balancer | Application Gateway | Load balancing | Application Gateway is a layer 7 load balancer. It supports SSL termination, cookie-based session affinity, and round robin for load-balancing traffic. |
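
To make the Auto Scaling entry a bit more concrete, here is a minimal sketch of the threshold idea it describes: a metric is compared against two thresholds, and the instance count goes up or down by one. This is not the API of AWS Auto Scaling, Virtual Machine Scale Sets, or GCP Autoscaling; the `ScalingPolicy` class, the `desired_instances` function, the CPU metric, and all default values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; names and defaults are illustrative assumptions."""
    scale_out_above: float = 75.0  # average CPU % above which one instance is added
    scale_in_below: float = 25.0   # average CPU % below which one instance is removed
    min_instances: int = 1
    max_instances: int = 10


def desired_instances(current: int, avg_cpu_percent: float, policy: ScalingPolicy) -> int:
    """Return the instance count after applying one threshold check."""
    if avg_cpu_percent > policy.scale_out_above:
        target = current + 1
    elif avg_cpu_percent < policy.scale_in_below:
        target = current - 1
    else:
        target = current
    # Clamp to the configured bounds so the pool never empties or grows unbounded.
    return max(policy.min_instances, min(policy.max_instances, target))


if __name__ == "__main__":
    policy = ScalingPolicy()
    print(desired_instances(current=3, avg_cpu_percent=90.0, policy=policy))
```

The managed services typically layer more on top of this, such as cooldown periods and multiple metrics, so treat the sketch as the core decision only.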