
CDP on Azure: Harnessing the Power of Data Flow and Event Processing

Data is being created at an ever-increasing rate, and generating insights from event streams has become a critical function for businesses. How can we process the data flowing through the enterprise, evaluating, enriching and transforming it in real time, to enable fast analytics that support intelligent decision making? Join us for this session, where we will look at how to use the elastic nature of Azure to scale Data Flows and perform SQL operations in real time on streaming data from a variety of sources.

Future of Data Meetup: Future of data and analytics in the Hybrid & Multi Cloud

The most valuable and transformative business use cases require multiple analytics workloads, data science tools and machine learning algorithms to run against the same diverse data sets. It’s how the most innovative enterprises unlock value from their data. Turning data into useful insights is not easy, to say the least. The workloads need to be optimised for hybrid and multi-cloud environments, delivering the same data management capabilities across bare metal, private and public clouds. In this session, we will discuss how businesses can combine best-in-class software with the public cloud to turn raw data into actionable insights, without the overheads and without compromising performance, security and governance.

Introducing Cloudera DataFlow for the Public Cloud

With the rise of streaming data (or data in motion), companies must figure out how to deliver high-scale data ingestion, transformation, and management. In this session, you’ll see how Cloudera Data Platform’s (CDP) new DataFlow service provides real-time data movement capabilities to address hybrid cloud use cases.

Developing a Basic Web Application using an Operational DB on CDP

In this video, you'll see a simple demo of how to build a web application on top of a Cloudera Operational Database. We'll leverage the Apache Phoenix integration to easily write SQL statements against our database and use the Python Flask library to power the back-end API calls. The web application will be hosted within Cloudera Machine Learning, showcasing some of the benefits of having your data within a hybrid data platform.
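
As a rough illustration of the query layer behind such a back-end call, here is a minimal sketch written against Python's standard DB-API interface. It is not the code from the video: the `customers` table, the `fetch_customers` and `demo_connection` names, and the sample rows are all hypothetical, and an in-memory `sqlite3` database stands in for the Phoenix connection so the example runs anywhere. With a real Operational Database, the connection would presumably come from a Phoenix driver such as `phoenixdb`, and writes would use Phoenix's `UPSERT` rather than `INSERT`.

```python
import sqlite3


def fetch_customers(conn, min_orders):
    """Return customers with at least `min_orders` orders as JSON-ready dicts.

    `conn` is any DB-API 2.0 connection that uses `?` placeholders.
    """
    cur = conn.cursor()
    # Parameterized query: user input is never spliced into the SQL text.
    cur.execute(
        "SELECT id, name, orders FROM customers WHERE orders >= ? ORDER BY id",
        (min_orders,),
    )
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def demo_connection():
    """Build an in-memory stand-in database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, orders INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", 5), (2, "Grace", 1), (3, "Linus", 3)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(fetch_customers(demo_connection(), 3))
```

A Flask route would typically call a function like `fetch_customers` and return its result as JSON; keeping the query in a plain function makes it easy to test without a running web server.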

Processing DICOM Files With Spark on CDP Hybrid Cloud

In this video, you will see how you can use PySpark to process medical images from an MRI and convert them from DICOM format to PNG. The data is read from and written to AWS S3, and we leverage the NumPy and pydicom libraries to do the data transformation. We are using data from the "RSNA-MICCAI Brain Tumor Radiogenomic Classification" Kaggle competition, but this approach can be used for general-purpose DICOM processing.
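
One step in a DICOM-to-PNG conversion is rescaling the raw scanner intensities, which often exceed 8 bits, into the 0 to 255 range of a greyscale PNG. The sketch below shows one common way to do that with NumPy using min-max scaling. It is an assumption about how such a step could look, not the code from the video, and the `to_uint8` name is hypothetical. In a full pipeline the input array would likely come from `pydicom.dcmread(path).pixel_array`, and the output could be saved with an imaging library such as Pillow.

```python
import numpy as np


def to_uint8(pixels):
    """Min-max scale raw intensities to the 0-255 range of an 8-bit image."""
    arr = np.asarray(pixels, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # A constant image has no contrast to stretch; return all black.
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


if __name__ == "__main__":
    # Hypothetical 2x2 slice of 12-bit intensities standing in for an MRI frame.
    frame = np.array([[0, 1024], [2048, 4095]])
    print(to_uint8(frame))
```

Because the function takes and returns plain arrays, it can be applied per file inside a PySpark map step without depending on Spark itself.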