The Great Data Revolution Is Here, and Qlik Customers Are at the Heart of It

Data is one of the biggest challenges and opportunities we face in our lifetime: the amount we create, how we create it, how it is accessed (by people and by artificial intelligence and machines alike), and how we use it to inform, propel, and influence everyone and everything. And it’s driving enormous change.

The Data Chief Live: Beyond the Buzz in Data Mesh, Lakehouse, Data Warehouse

Join The Data Chief Live on October 7 to go beyond the buzz on all things data mesh, lakehouse, and data warehouse. Gain clarity on what is hype, what is real, and how others are delivering business value faster with modern data platforms and processes. You'll hear live from Darren Pedroza, VP Enterprise Data and Analytics at First Command Financial Services, Inc.; Zhamak Dehghani, Director of Emerging Technologies at Thoughtworks and author of The Data Mesh; Chris D'Agostino, Global Field CTO at Databricks; and me.

Processing DICOM Files With Spark on CDP Hybrid Cloud

In this video, you will see how you can use PySpark to process medical images from an MRI and convert them from DICOM format to PNG. The data is read from and written to AWS S3, and we use the numpy and pydicom libraries to do the data transformation. We are using data from the "RSNA-MICCAI Brain Tumor Radiogenomic Classification" Kaggle competition, but this approach can be used for general-purpose DICOM processing.

Talend iPaaS momentum grows. Talend recognized in the 2021 Gartner Magic Quadrant for Enterprise iPaaS

As organizations continue to embrace cloud-based computing as the cornerstone of their digital transformation, the integration platform as a service (iPaaS) has become a critical component of their integration environments. An iPaaS solution simplifies the integration of data, applications, and systems, whether in the cloud or on-premises, through unified support for API, application, data, and B2B integration styles.

Snowflake BUILD 2021: Opening Keynote

Did you miss the Snowflake Build Data Cloud Dev Summit keynote, or like it so much you want to watch it again? Well, you’re in the right place. Join our SVP of Engineering and Support, Greg Czajkowski, as he kicks off this year’s Snowflake BUILD, shares his vision for the Data Cloud and the opportunity it presents for developers, and features unique applications built on Snowflake. Greg will be joined on stage by Chris Child, Sr. Director of Product, who will highlight the recently launched Powered by Snowflake partner program, including interviews with SK, VideoAmp, and Human, who will share their Snowflake stories. Chris will also be joined by Snowflake product experts to make exciting announcements and demo cool new products for developers.

Struggling to Manage your Multi-Tenant Environments? Use Chargeback!

If your organization is using multi-tenant big data clusters (and everyone should be), do you know the usage and cost efficiency of cluster resources for each tenant? A chargeback or showback model allows IT to attribute costs and resource usage to the actual analytic users in the multi-tenant cluster, instead of to the platform (“overhead”) or the IT department. This lets you know the individual costs per tenant and set limits in order to control overall costs.
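As a rough illustration of the arithmetic behind a chargeback model, the sketch below splits a cluster's total cost across tenants in proportion to their measured usage and then flags tenants that exceed a spending limit. This is an assumption about one simple allocation scheme, not the method of any particular product; the tenant names, usage figures (imagined here as core-hours), costs, and limits are all made up, and `allocate_costs` and `over_limit` are hypothetical helper names.

```python
def allocate_costs(total_cost, usage_by_tenant):
    """Split total_cost across tenants in proportion to their usage."""
    total_usage = sum(usage_by_tenant.values())
    if total_usage == 0:
        # Nothing was used, so nothing is charged back.
        return {tenant: 0.0 for tenant in usage_by_tenant}
    return {
        tenant: total_cost * usage / total_usage
        for tenant, usage in usage_by_tenant.items()
    }


def over_limit(costs, limits):
    """Return the tenants whose allocated cost exceeds their limit, sorted by name."""
    return sorted(
        tenant
        for tenant, cost in costs.items()
        if tenant in limits and cost > limits[tenant]
    )


# Hypothetical monthly figures for a shared cluster.
usage = {"marketing": 500, "finance": 300, "data_science": 1200}
costs = allocate_costs(10_000, usage)
flagged = over_limit(costs, {"data_science": 5_000, "finance": 2_000})
```

With these made-up numbers, data_science accounts for 1,200 of 2,000 usage units and is therefore charged 60% of the bill, which puts it over its hypothetical limit. A showback model would report the same figures without actually billing the tenants.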

An Introduction to Ranger RMS

Cloudera Data Platform (CDP) has supported access controls on tables and columns, as well as on files and directories, via Apache Ranger since its first release. It is common to have different workloads using the same data: some require authorization at the table level (Apache Hive queries) and others at the level of the underlying files (Apache Spark jobs). Unfortunately, in such instances you would have to create and maintain separate, corresponding Ranger policies for both Hive and HDFS.

Our reflections on the 2021 Gartner Magic Quadrant for Data Quality Solutions

Success for any business starts with data that is easily discoverable, understandable, and of value to the people who need it. We call this type of data “healthy data.” You should look at a wide set of measures and metrics to determine whether data is healthy or not, but at the core of all healthy data is a high level of quality.