image_pdfimage_print

Over the last 10 years, there’s been an explosion in the volume of both structured and unstructured data sets. Every year, more and more data is being generated, stored, analyzed, and mined for business insights.  This growth has driven the adoption of alternative approaches to managing business data. Amazon Simple Storage Service (S3) has become the industry standard for the management of large volumes of data. In this post, I’ll share how to create an external Oracle table using the Oracle DBMS_CLOUD package, and then run a query against a comma-separated value (CSV) file located in a Pure Storage® FlashBlade® S3 Object Storage bucket. 

For the purpose of this post, I’ve pre-created a FlashBlade Object Store bucket, uploaded a CSV file, set up the Oracle DBMS_CLOUD package, and configured the S3 credentials within my Oracle 21c database. 

List Objects

Let’s start by confirming the size and number of rows within our CSV file called airports.csv

Below, we can see that the file has 72,911 rows and is 4,965,317 bytes (4.7M).

We can confirm the object is accessible with the aws s3api utility and list-objects option, for example.

Create Table

Log on to an Oracle 19c/21c pluggable database (PDB) and create an external table using the dbms_cloud package.

The example below creates an external table called airports_fb using the airports.csv file stored in an S3 Object Storage bucket called ora-bucket.

Query Table

Now that we’ve created an external S3 table, we can confirm the table definition using the SQLPlus desc command.   VARCHAR2(20)

We can check the number of rows by using count, for example.

And run a query against the external table, for example, to return all airports that have London in their name.

Explain Plan

We can confirm our query is using an external table by creating an explain plan for the sql, for example:

From the explain plan, we can see that Oracle has used an external table. 

Table Update

From the output above, we can see the record for London Colney is missing a value for municipality. What happens if we try to update a row?

Or, attempt to delete a row?

As expected, they both fail as external tables don’t support data manipulation language (DML). They’re designed to be read-only, therefore any amendments need to be performed externally.

Summary on Oracle

In this post, I’ve demonstrated how you can access and query external data stored in FlashBlade S3 Object Storage buckets directly from an Oracle 19c/21c database using the Oracle DBMS_CLOUD package.