
How to Ensure Supply Chain Security for AI Applications

Machine learning (ML) is at the heart of the boom in AI applications, revolutionizing various domains. From powering Large Language Model (LLM) based chatbots like ChatGPT and Bard to enabling text-to-image generators like Stable Diffusion, ML continues to drive innovation. Its transformative impact advances multiple fields, from genetics to medicine to finance. Without exaggeration, ML has the potential to profoundly change lives, if it hasn’t already.

Streaming Data Pipeline Development

In this interactive Meetup session, Tim will lead participants through how best to build streaming data pipelines. He will cover how to build applications from some common use cases and highlight tips, tricks, best practices and patterns. He will show how to build the easy way and then dive deep into the underlying open source technologies, including Apache NiFi, Apache Flink, Apache Kafka and Apache Iceberg.

HDFS Snapshot Best Practices

The snapshots feature of the Apache Hadoop Distributed Filesystem (HDFS) enables you to capture point-in-time copies of the file system and protect your important data against corruption and user or application errors. This feature is available in all versions of Cloudera Data Platform (CDP), Cloudera Distribution for Hadoop (CDH) and Hortonworks Data Platform (HDP).

The Art of Data Leadership | A discussion with Chief Digital Officer, Ray Kunik

Our Chief Data & Analytics Officer, Shayde Christian, sits down for a buzzworthy conversation with Chief Digital Officer Raymond L. Kunik Jr. to discuss the “other” CDO role, the science behind work-life integration, the impact and applications of #AI, and its correlation with a pretty sweet hobby.

Why Reinvent the Wheel? The Challenges of DIY Open Source Analytics Platforms

In an effort to reduce technology spend, some organizations that leverage open source projects for advanced analytics consider either building and maintaining their own runtime with the required data processing engines or retaining older, now obsolete versions of legacy Cloudera runtimes (CDH or HDP).