Over the last 10 years, there’s been an explosion in the volume of both structured and unstructured data sets. Every year, more and more data is being generated, stored, analyzed, and mined for business insights. This growth has driven the adoption of alternative approaches to managing business data. Amazon Simple Storage Service (S3) has become the industry standard for the management of large volumes of data. In this post, I’ll share how to create an external Oracle table using the Oracle DBMS_CLOUD package, and then run a query against a comma-separated value (CSV) file located in a Pure Storage ® FlashBlade ® S3 Object Storage bucket.
For the purpose of this post, I’ve pre-created a FlashBlade Object Store bucket, uploaded a CSV file, set up the Oracle DBMS_CLOUD package, and configured the S3 credentials within my Oracle 21c database.
List Objects
Let’s start by confirming the size and number of rows within our CSV file called airports.csv .
Below, we can see that the file has 72,911 rows and is 4,965,317 bytes (4.7M).
% ls – l airports . csv
– rw – r — r — @ 1 rekins & nbsp ; staff & nbsp ; 4965317 & nbsp ; 3 Aug 15 : 44 airports . csv
& nbsp ;
% wc – l airports . csv
& nbsp ; & nbsp ; & nbsp ; 72911 airports . csv
We can confirm the object is accessible with the aws s3api utility and list-objects option, for example.
$ aws s3api list – objects — bucket ora – bucket — query ‘Contents[].{Key: Key, Size: Size}’ — endpoint – url https : //s3-fbstaines02.uklab.purestorage.com –profile fbstaines02 –ca-bundle /home/oracle/dbc/commonstore/wallets/ssl/pureCA.pem
[
& nbsp ; & nbsp ; & nbsp ; & nbsp ; {
& nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; “Key” : “airports.csv” ,
& nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; “Size” : 4965317
& nbsp ; & nbsp ; & nbsp ; & nbsp ; }
]
Create Table
Log on to an Oracle 19c/21c pluggable database (PDB) and create an external table using the dbms_cloud package.
The example below creates an external table called airports_fb using the airports.csv file stored in an S3 Object Storage bucket called ora-bucket.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
begin
& nbsp ; & nbsp ; dbms_cloud . create_external_table (
& nbsp ; & nbsp ; & nbsp ; & nbsp ; table_name & nbsp ; =& gt ; ‘airports_fb’ ,
& nbsp ; & nbsp ; & nbsp ; & nbsp ; credential_name =& gt ; ‘fb_cred’ ,
& nbsp ; & nbsp ; & nbsp ; & nbsp ; file_uri_list & nbsp ; =& gt ; ‘<a href=”https://s3-fbstaines02.amazonaws.com/ora-bucket/airports.csv”>https://s3-fbstaines02.uklab.purestorage.com/ora-bucket/airports.csv</a>’ ,
& nbsp ; & nbsp ; & nbsp ; & nbsp ; column_list & gt ; =& gt ; ‘ident varchar2(20),
type varchar2(20),
name varchar2(20),
elevation_ft number(6),
continent varchar2(20),
iso_country varchar2(20),
iso_region varchar2(20),
municipality varchar2(20)’ ,
& nbsp ; & nbsp ; & nbsp ; & nbsp ; format & nbsp ; & nbsp ; & nbsp ; =& gt ; json_object ( ‘type’ value ‘csv’ , ‘skipheaders’ value ‘1’ , ‘ignoremissingcolumns’ value ‘true’ , ‘conversionerrors’ value ‘store_null’ , ‘rejectlimit’ value ‘unlimited’ )
& nbsp ; ) ;
end ;
/
Query Table
Now that we’ve created an external S3 table, we can confirm the table definition using the SQLPlus desc command . VARCHAR2(20)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 – Production
Version 21.6.0.0.0
& nbsp ;
SQL & gt ; desc airports_fb ;
& nbsp ; Name & nbsp ; Null ? & nbsp ; & nbsp ; Type
& nbsp ; — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — –
& nbsp ;
& nbsp ; IDENT & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; TYPE & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; NAME & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; ELEVATION_FT & nbsp ; & nbsp ; NUMBER ( 6 )
& nbsp ; CONTINENT & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; ISO_COUNTRY & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; ISO_REGION & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ; MUNICIPALITY & nbsp ; & nbsp ; VARCHAR2 ( 20 )
& nbsp ;
SQL & gt ;
We can check the number of rows by using count , for example.
SQL & gt ; select count ( * ) from airports_fb ;
& nbsp ; & nbsp ; COUNT ( * )
— — — — —
& nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; 72911
SQL & gt ;
And run a query against the external table, for example, to return all airports that have London in their name.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
SQL & gt ; select NAME , ISO_REGION , MUNICIPALITY from AIRPORTS_FB where NAME like ‘%London%’ ;
NAME & nbsp ; & nbsp ; ISO_REGION & nbsp ; MUNICIPALITY
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
London Airport & nbsp ; & nbsp ; CA – ON & nbsp ; London
London Luton Airport GB – ENG & nbsp ; London
London City Airport & nbsp ; GB – ENG & nbsp ; London
London Heliport & nbsp ; & nbsp ; & nbsp ; GB – ENG & nbsp ; London
London Colney & nbsp ; & nbsp ; GB – ENG
London Gliding Club & nbsp ; GB – ENG & nbsp ; Dunstable
New London Airport & nbsp ; US – VA . Forest
7 rows selected .
SQL & gt ;
Explain Plan
We can confirm our query is using an external table by creating an explain plan for the sql, for example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
SQL & gt ; EXPLAIN PLAN FOR
& nbsp ; & nbsp ; 2 & nbsp ; select NAME , ISO_REGION , MUNICIPALITY from AIRPORTS_FB where NAME like ‘%London%’ order by NAME ;
& nbsp ;
Explained .
& nbsp ;
SQL & gt ; select plan_table_output from
& nbsp ; & nbsp ; 2 & nbsp ; table ( dbms_xplan . display ( ‘plan_table’ , null , ‘basic’ ) ) ;
& nbsp ;
PLAN_TABLE_OUTPUT
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Plan hash value : 587737101
& nbsp ;
— — — — — — — — — — — — — — — — — — — — — — — — — — — —
| Id & nbsp ; | Operation | Name & nbsp ; & nbsp ; & nbsp ; & nbsp ; |
— — — — — — — — — — — — — — — — — — — — — — — — — — — —
| & nbsp ; 0 | SELECT STATEMENT | & nbsp ; & nbsp ; & nbsp ; |
| & nbsp ; 1 | & nbsp ; PX COORDINATOR | & nbsp ; & nbsp ; & nbsp ; |
| & nbsp ; 2 | & nbsp ; PX SEND QC ( ORDER ) | : TQ10001 & nbsp ; & nbsp ; |
| & nbsp ; 3 | & nbsp ; & nbsp ; SORT ORDER BY | & nbsp ; & nbsp ; & nbsp ; |
| & nbsp ; 4 | & nbsp ; & nbsp ; PX RECEIVE | & nbsp ; & nbsp ; & nbsp ; |
| & nbsp ; 5 | & nbsp ; & nbsp ; & nbsp ; PX SEND RANGE | : TQ10000 & nbsp ; & nbsp ; |
& nbsp ;
PLAN_TABLE_OUTPUT
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
| & nbsp ; 6 | & nbsp ; & nbsp ; & nbsp ; PX BLOCK ITERATOR | & nbsp ; & nbsp ; & nbsp ; |
| & nbsp ; 7 | & nbsp ; & nbsp ; & nbsp ; & nbsp ; EXTERNAL TABLE ACCESS FULL | AIRPORTS_FB |
— — — — — — — — — — — — — — — — — — — — — — — — — — — —
& nbsp ;
14 rows selected .
& nbsp ;
SQL & gt ;
From the explain plan, we can see that Oracle has used an external table.
Applications and Databases
Simplify management, boost performance, and dramatically cut costs
Table Update
From the output above, we can see the record for London Colney is missing a value for municipality. What happens if we try to update a row?
SQL & gt ; update AIRPORTS_FB set MUNICIPALITY = ‘London’ where NAME = ‘London Colney’ ;
update AIRPORTS_FB set MUNICIPALITY = ‘London’ where NAME = ‘London Colney’
& nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; *
ERROR at line 1 :
ORA – 30657 : operation not supported on external organized table & lt ;
Or, attempt to delete a row?
SQL & gt ; delete from AIRPORTS_FB where NAME = ‘London Colney’ ;
delete from AIRPORTS_FB where NAME = ‘London Colney’
& nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; & nbsp ; *
ERROR at line 1 :
ORA – 30657 : operation not supported on external organized table
As expected, they both fail as external tables don’t support data manipulation language (DML). They’re designed to be read-only, therefore any amendments need to be performed externally.
Summary on Oracle
In this post, I’ve demonstrated how you can access and query external data stored in FlashBlade S3 Object Storage buckets directly from an Oracle 19c/21c database using the Oracle DBMS_CLOUD package.