Modernizing Data Pipelines using Cloudera Data Platform - Part 1

Data pipelines are in high demand in today’s data-driven organizations. As critical elements in supplying trusted, curated, and usable data for end-to-end analytic and machine learning workflows, data pipelines are becoming indispensable. To keep up, they are being vigorously reshaped with modern tools and techniques.

Apache Ozone Metadata Explained

Apache Ozone is a distributed object store built on top of the Hadoop Distributed Data Store (HDDS) service. It can manage billions of small and large files, a workload that other distributed file systems find difficult to handle. To achieve better scalability, Ozone splits metadata management across different services: the Ozone Manager (OM) service manages the namespace metadata, such as volumes, buckets, and keys.

7 Data Migration Best Practices and Tools

Data migration seems simple from a high-level point of view. After all, you’re simply moving data between two or more locations. In practice, however, migrating data can be one of your IT department’s trickiest data management initiatives. According to LogicWorks, 90 percent of CIOs in charge of data migrations moving from on-premises to the cloud have encountered problems during this process, with 75 percent missing planned deadlines.

PII Data Privacy: How to Stay Compliant

When people share their personal information with an organization, they’re performing an act of trust. They trust you to keep their data safe from hackers, and they trust you to use their data only for legitimate purposes. While many organizations honor this trust, others do not. As a result, governments worldwide are rushing to pass data protection legislation that puts the power back in the hands of people.