Tag: Streaming
Data Analytics Official Blog Streaming Dec. 9, 2024: Google Cloud named a Leader in the 2024 Gartner Magic Quadrant for Data Integration Tools - Google Cloud has been recognized as a Leader in the 2024 Gartner Magic Quadrant for Data Integration Tools. Google Cloud's unified data and AI capabilities, combined with its comprehensive suite of fully managed services, empower organizations to ingest, process, transform, orchestrate, analyze, and activate their data with unprecedented speed and efficiency.
BigQuery IoT Streaming Nov. 25, 2024: Tracking 10,000 IoT drones with PubSub and BigQuery GeoSpatial - Google Cloud's Pub/Sub and BigQuery provide a scalable solution for tracking and analyzing data from IoT drones. Pub/Sub ensures reliable data ingestion, while BigQuery's geospatial functions unlock insights from location data. This architecture has practical applications in precision agriculture, delivery logistics, infrastructure inspection, disaster response, and more. Explore more about BigQuery's geospatial capabilities in the official documentation.
Cloud Dataproc Data Analytics Official Blog Streaming Nov. 18, 2024: Dataproc Serverless: Now faster, easier and smarter - Dataproc Serverless now offers faster performance with native query execution in the Premium tier, improving query performance by ~47% in tests. It also introduces a built-in Spark UI for seamless monitoring and troubleshooting, eliminating the need for setting up and maintaining persistent history servers.
BigQuery Data Analytics Official Blog Streaming Oct. 14, 2024: BigQuery tables for Apache Iceberg: optimized storage for the open lakehouse - BigQuery tables for Apache Iceberg, a fully managed, Apache Iceberg-compatible storage engine from BigQuery, offer optimized storage for the open lakehouse. They provide features like autonomous storage optimizations, clustering, and high-throughput streaming ingestion.
BigQuery Data Analytics Official Blog Streaming Oct. 14, 2024: Using BigQuery Omni to reduce log ingestion and analysis costs in a multi-cloud environment - BigQuery Omni helps reduce the cost of log analytics in multi-cloud environments by eliminating the need for Apache Spark workloads and providing a unified querying process across cloud providers. It offers reduced engineering and compute resources, as well as lower egress costs.
Apache Flink Data Analytics Official Blog Streaming Oct. 14, 2024: Real-time data for real-world AI with support for Apache Flink in BigQuery - BigQuery Engine for Apache Flink, now in preview, provides a serverless real-time intelligence platform. It allows users to easily migrate existing streaming applications relying on Apache Flink to Google Cloud without code rewriting or third-party services.
Cloud Dataflow Data Analytics Official Blog Streaming Oct. 7, 2024: Mastering Dataflow: 5 In-Depth Guides to Real-World Applications - Google Cloud's Dataflow offers a range of solutions for real-time data processing. These include machine learning and generative AI, ETL and integration, log replication and analytics, marketing intelligence, and clickstream analytics. Each solution guide provides an overview, detailed sketch, and link to a comprehensive guide with code samples and best practices. With Dataflow's scalability, flexibility, and reliability, developers can build real-time solutions efficiently.
Billing Cloud Dataflow Data Analytics Official Blog Streaming Sept. 9, 2024: Cut costs and boost efficiency with Dataflow's new custom source reads - Dataflow's new custom source reads feature helps cut costs and boost efficiency in streaming environments by better distributing workloads and proactively relieving overwhelmed workers with load balancing.
Airflow Cloud Composer Data Analytics Official Blog Streaming Aug. 26, 2024: Scalable alerting for Apache Airflow to improve data orchestration reliability and performance - This guide reviews the hierarchy of alerting on Cloud Composer and various alerting options available to Google Cloud engineers using Cloud Composer and Apache Airflow.
Cloud Dataflow Data Analytics GCP Experience Official Blog Streaming Aug. 19, 2024: Yahoo compares Dataflow vs. self-managed Apache Flink for two streaming use-cases - Yahoo compared the cost and performance of self-managed Apache Flink and Google Cloud Dataflow for two streaming use cases: converting Avro to Parquet, and data enrichment and calculation. Dataflow was found to be around 1.5 to 2 times more cost-effective than Flink, primarily due to the Streaming Engine's ability to handle heavy computations, resulting in fewer required vCPUs and more consistent throughput.
Data Analytics Official Blog Streaming Aug. 19, 2024: Try the new Managed Service for Apache Kafka and take cluster management off your todo list - Google Cloud has launched a new Managed Service for Apache Kafka, which simplifies the process of running an Apache Kafka cluster. The service takes care of infrastructure management, security, networking, and scaling, allowing users to focus on building and running their applications. It offers built-in security features, automated network design, and flexible sizing options.
Airflow Cloud Composer Data Analytics Official Blog Streaming Aug. 12, 2024: Announcing Apache Airflow operators for Google generative AI - Apache Airflow now has operators to interact with Vertex AI's generative models. These operators enable the integration of Vertex AI's generative models into data pipelines orchestrated by Apache Airflow and Cloud Composer.
BigQuery Data Analytics Official Blog Streaming Aug. 12, 2024: Real-time in no time: Introducing BigQuery continuous queries for up-to-the-minute insights - BigQuery continuous queries, now available in preview, enable real-time data analysis and event-driven processing using SQL. They simplify real-time pipelines, unlock AI use cases, streamline reverse ETL, and provide scalability and performance. With BigQuery continuous queries, businesses can gain real-time insights, make informed decisions, and deliver exceptional customer experiences.
Cloud Load Balancing Infrastructure Networking Streaming Aug. 5, 2024: Load Balancing Blitz — data pipeline - This blog post explores a near real-time data pipeline to gather metrics for a demo game called Load Balancing Blitz. Pub/Sub, BigQuery, and Looker were used to ingest, process, and visualize data in near real time.
Airflow Cloud Composer Data Analytics Official Blog Streaming July 29, 2024: Understanding Airflow DAG and task concurrency on Google Cloud Composer - Airflow DAG and task concurrency are crucial for optimizing Cloud Composer performance. This guide provides comprehensive insights into concurrency settings across four levels: Composer environment, Airflow installation, DAG, and task. By understanding these settings, you can ensure efficient resource utilization, scalability, and fault tolerance in your data pipelines.
Data Analytics Databases Datastream Official Blog Streaming July 29, 2024: Datastream’s SQL Server source is generally available - Datastream, a serverless change data capture (CDC) and replication service, now supports SQL Server as a source for replicating data to BigQuery, Cloud Storage, and other Google Cloud destinations. Key enhancements include change tables CDC, stream recovery, gcloud API and Terraform support, and server-side SSL/TLS encryption.
Analytics Hub Cloud Pub/Sub Data Analytics Official Blog Streaming July 15, 2024: Share your streaming data with Pub/Sub topics in Analytics Hub - Analytics Hub now supports sharing Pub/Sub topics, enabling organizations to curate, share, and monetize their streaming data assets. By leveraging Analytics Hub Exchanges and Listings, businesses can logically categorize and group sets of Pub/Sub topics and provision access at scale.
Cloud Dataproc Data Analytics Official Blog Streaming July 15, 2024: Deployment patterns for Dataproc Metastore on Google Cloud - This blog post explores four Dataproc Metastore (DPMS) deployment patterns: a single centralized multi-regional DPMS, centralized metadata federation with per-domain DPMS, decentralized metadata federation with per-domain DPMS, and ephemeral metadata federation. Each pattern has its own advantages and disadvantages, and the best choice for an organization will depend on its specific needs and requirements.
Data Analytics Datastream Official Blog Streaming July 8, 2024: Announcing new stream recovery capabilities for Datastream - Datastream stream recovery enables quick resumption of data replication with minimal to no data loss in scenarios like database failovers or network outages.
Data Analytics Datastream Official Blog Streaming June 24, 2024: Simplify historical data tracking in BigQuery with Datastream's append-only CDC - Datastream's append-only mode simplifies change data capture by preserving every change as a new row in your target BigQuery table. It offers cost efficiency, improved data accuracy, and real-time insights. With append-only mode, businesses can maintain a historical record of changes, track data modifications, and gain deeper insights from their data.
Cloud Dataflow Data Analytics Official Blog Streaming June 10, 2024: Boost developer productivity with new pipeline validation capabilities in Dataflow - Dataflow pipeline validation is now generally available. It performs dozens of checks to ensure that your batch or streaming job is error-free and can run successfully.
BigQuery Cloud Dataflow Official Blog Streaming June 3, 2024: Accelerating CDC insights with Dataflow and BigQuery - This post covers how to use BigQuery’s new CDC capability in Dataflow along with the new Dataflow at-least-once streaming mode to simplify your CDC pipeline and reduce costs.
AWS Cloud Pub/Sub Data Analytics Official Blog Streaming June 3, 2024: Easily stream data from AWS Kinesis to Google Cloud with Pub/Sub import topics - Pub/Sub import topics enable streaming ingestion into BigQuery from external sources, with the first supported external source being Amazon Kinesis Data Streams. Import topics provide a simplified way to ingest data from Amazon Kinesis Data Streams directly into Pub/Sub, reducing the complexity of setting up data pipelines between clouds. Once the connection is established, Amazon Kinesis producers can be gradually migrated to Pub/Sub publishers. Data from Amazon Kinesis Data Streams can be routed to BigQuery using BigQuery subscriptions, and Pub/Sub autoscales to adapt to changes in the Amazon Kinesis data stream.
Cloud Dataflow Data Analytics Official Blog Streaming May 27, 2024: More flexibility for your Dataflow jobs with new controls for latency versus cost - Dataflow Streaming Engine users can now choose between lower peak latency and lower streaming costs for their workloads by adjusting the autoscaling utilization hint value. The hint can be raised or lowered using a Dataflow service option. Dataflow’s autoscaling UI provides insights on when it’s worth adjusting the autoscaling behavior, along with additional dashboards and metrics to monitor the impact of changes.
Data Analytics Official Blog Streaming May 27, 2024: Google Data Cloud innovations for continuous real-time intelligence - Google Cloud offers innovations for continuous real-time intelligence, enabling organizations to harness real-time analytics and make informed decisions. With Dataflow, BigQuery, and Apache Kafka for BigQuery, enterprises can leverage streaming infrastructure for visibility, predictions, and activation. Customers like Spotify, Puma, Compass, and Tyson Foods have achieved significant business impact using Google Cloud's data, AI, and real-time solutions.
Cloud Dataflow Official Blog Streaming May 20, 2024: No work items left unturned: How Dataflow mitigates stragglers