How Apache Pulsar Functions are Enablers for ML and Event Stream Processing
Apache Pulsar Functions let programmers script automated processing of event streams and messages on cloud TPU/GPU hardware, in support of AI/ML/DL workloads, using custom code written in Java, Python, or Go. Many developers are also using Apache Pulsar to implement serverless computing after […]
Why Is Geo-Replication Functionality Important For Apache Pulsar Users?
Most companies are either already on a public cloud or planning to move to one. Cloud architects want to design applications with zero downtime, or at least 99.999% availability. Let’s take a look at a use case to better understand this idea. For example, […]
Why Event Streaming is a Key Enabler of ML / AI Applications
Event Streaming Concisely Defined
With analytics applications of all flavors now rooted in machine learning and AI, event streaming is increasingly important. What is the most concise definition of event streaming? Real-time data streams flowing from financial transactions, for example, contain events such […]
Example Python Functions With Apache Pulsar
Brief Intro
One of the most exciting features of Apache Pulsar is Pulsar Functions. The general premise is that if you have a series of messages/events, you can apply arbitrary logic to each message in a serverless and stateful way. This feels similar to AWS Lambda, where you […]
The PubSub Messaging Concept of Apache Pulsar
The publish-subscribe model of messaging is a key aspect of event-driven architecture in software development, where complex arrays of filters, tags, and keywords are used to deliver text, image, video, and other content that is customized for each user by device, display, and personal characteristics. The PubSub messaging concept […]
Pulsar with Pandio: Don’t Use Apache Kafka
Real-time, continuous data feeds that power systems and applications are increasingly critical for businesses and organizations of all sizes today. For nearly a decade, many organizations have relied on Apache Kafka, an open-source distributed software platform, to handle those data feeds. More than 30% of the Fortune 500 […]
Kafka vs Pulsar: Why Pulsar Outperforms Kafka Every Time
Apache Kafka and Pulsar are both used within the Hadoop ecosystem, as well as hundreds of other ecosystems, for real-time event processing of data streams in “Big Data” applications operating at the highest levels of web and mobile traffic. In benchmark testing conducted by independent researchers using […]
What Is Event Streaming and Why Is It Critical for Big Data Applications?
The evolution of the Hadoop platform spans 20 years of computer science in enterprise corporations, advancing the major paradigms of search, cloud computing, and social networking in datacenter research and software development. The core Hadoop project now consists of the fundamental HDFS, […]