2020 has begun with a lot of developments in analytics and data management at Oracle. The latest OpenWorld convention, held this month in London, brought us several new products: Oracle Data Science, Oracle Data Flow and the subject of today's post, Oracle Cloud Infrastructure Data Catalog.
Data catalogues are becoming increasingly necessary in large organisations that manage different, heterogeneous data sources: on-premise, cloud, hybrid cloud and multi-cloud. All of this makes it more and more difficult to manage and control our information.
To help organisations organise their “data assets”, Oracle has created Oracle Data Catalog, which is included in Oracle Cloud Infrastructure at no additional cost.
With Oracle Data Catalog we can centralise all of our data sources, allowing our users, data scientists, CDOs, etc. to understand, classify and “track” their data without worrying about where it is stored: on-premise databases, cloud databases or Big Data systems. Oracle Data Catalog brings all of the data sources together in one place, removing the complexity of accessing systems such as Hadoop or querying large databases via SQL.
Oracle Data Catalog connects to our data sources and imports all of their metadata: technical, business and operational.
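To make the three kinds of metadata more concrete, here is a minimal Python sketch of what a single catalogued asset could look like. The `CatalogEntry` class, its field names and the `find_by_tag` helper are all hypothetical and invented for this illustration; they are not the schema or the API that Oracle Data Catalog actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Hypothetical record for one catalogued data asset (not Oracle's schema)."""
    name: str                      # e.g. a table or file name
    source: str                    # where the asset lives
    # Technical metadata: column name -> data type, as read from the source.
    technical: Dict[str, str] = field(default_factory=dict)
    # Business metadata: tags or glossary terms added by users.
    business: List[str] = field(default_factory=list)
    # Operational metadata: facts about how and when the asset was catalogued.
    operational: Dict[str, str] = field(default_factory=dict)


def find_by_tag(entries: List[CatalogEntry], tag: str) -> List[str]:
    """Return the names of entries carrying the given business tag (case-insensitive)."""
    wanted = tag.lower()
    return [e.name for e in entries if wanted in (t.lower() for t in e.business)]
```

With a model like this, a user could look up every asset tagged with a business term without knowing which system stores it, which is the kind of question a central catalogue is meant to answer.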
The supported sources are the following:
- Oracle Cloud Infrastructure Object Storage (CSV, Excel, ORC, Avro, Parquet, JSON)
- Oracle Database (Cloud and On-Premise)
- Oracle Autonomous Transaction Processing
- Oracle Autonomous Data Warehouse
- Oracle MySQL (Cloud and On-Premise)
- Hive (running on Oracle Cloud Infrastructure)
- Kafka (running on Oracle Cloud Infrastructure)
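As a small illustration of the Object Storage entry in the list above, the following Python sketch checks whether a file's extension matches one of the listed formats. The extension spellings and the `listed_format` function are assumptions made for this example; the list above names the formats but does not say how the service detects them.

```python
from pathlib import PurePosixPath
from typing import Optional

# Formats listed above for Object Storage. The extension spellings
# are an assumption for this sketch, not taken from Oracle's documentation.
OBJECT_STORAGE_EXTENSIONS = {
    ".csv": "CSV",
    ".xls": "Excel",
    ".xlsx": "Excel",
    ".orc": "ORC",
    ".avro": "Avro",
    ".parquet": "Parquet",
    ".json": "JSON",
}


def listed_format(object_name: str) -> Optional[str]:
    """Return the listed format for an object name, or None if its extension is not in the table."""
    suffix = PurePosixPath(object_name).suffix.lower()
    return OBJECT_STORAGE_EXTENSIONS.get(suffix)
```

For instance, `listed_format("sales/2020/orders.parquet")` maps to the Parquet entry, while a name such as `notes.txt` returns `None` because plain text files are not in the list.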
Oracle Data Catalog also provides many other features: a REST API to integrate its functionality with third-party applications, schedulers to launch regular, unattended exploration of our datasets, etc.