Using Exadata flash based grid disks for Data Pump import

I’ve just been experimenting with using flash based grid disks for staging data pump dump files, with fairly successful results.

First I created a non-partitioned 55GB table. The table was based on OE.ORDERS from the sample schema, but I merged all of the partitions and dropped all of the constraints and indexes. This is now a simple table in a dedicated schema called MW. I then took an export of this one table, which we’ll use for the imports that follow, and then dropped the table from my schema.
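The export itself isn’t shown, but it would have been a straightforward single-table Data Pump job along these lines – the directory path, credentials and table name here are my assumptions; only the directory name mw_exp and the dump file name expdat.dmp are taken from later in this post:


create directory mw_exp as '/u01/mw_exp';

expdp mw/password directory=mw_exp dumpfile=expdat.dmp logfile=exp.log tables=mw.orders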

Note that in this case, I’m using a parallelism of 1 – in a real world scenario performance would benefit from parallelism across multiple RAC instances. Also note that this testing was performed on an X3-2 eighth rack – results will scale up greatly on the larger Exadata racks.
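To give an idea of what a multi-node run would look like, Data Pump can spread its workers across RAC instances with the PARALLEL and CLUSTER parameters, provided the dump is written as multiple pieces using the %U substitution variable and the directory is visible from every node (which is exactly where DBFS comes in). The parameters below are purely illustrative and not what was run for this test:


expdp mw/password directory=mw_exp dumpfile=orders%U.dmp parallel=4 cluster=y tables=mw.orders

impdp mw/password directory=mw_exp dumpfile=orders%U.dmp parallel=4 cluster=y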

The first test was importing from /u01 on the compute node. This took 6 minutes.

Next I set up DBFS with a store located on +DATA, copied the dump file to /dbfs_direct and ran the import from there. The import took 14 minutes.
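The DBFS setup itself isn’t shown here; it followed the same pattern as the flash based store created later in this post – a dedicated tablespace on the disk group, the dbfs_create_filesystem script, and a dbfs_client mount on the compute node. The tablespace, store, user and connect string names below are my assumptions:


create bigfile tablespace dbfs_ts datafile '+DATA' size 100g;
@?/rdbms/admin/dbfs_create_filesystem DBFS_TS staging

Then, as the OS user on the compute node (with the DBFS user’s password supplied on stdin):


nohup dbfs_client dbfs_user@db01 -o allow_other,direct_io /dbfs_direct < passwd.txt &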

Now I wanted to try staging the dump file on flash based storage. To do this, I dropped the flash cache on all of the storage servers and built grid disks on the flash cards instead. Obviously, you’re not going to want to do this on a production system once you have workloads running on it.


CellCLI> alter flashcache all flush
CellCLI> list celldisk attributes name,flushstatus,flusherror
CellCLI> drop flashcache
CellCLI> create griddisk all flashdisk prefix='FLASH'
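Before moving on to ASM, a quick sanity check (not part of the original steps) confirms the new grid disks exist on each cell with the expected size and status:


CellCLI> list griddisk attributes name,size,status where name like 'FLASH.*'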

Once this was complete on all cells, I could create the disk group within ASM.


create diskgroup FLASH_DG high redundancy disk 'o/192.168.131.3/FLASH_FD_03_dm01celadm01',
'o/192.168.131.4/FLASH_FD_03_dm01celadm02',
'o/192.168.131.3/FLASH_FD_01_dm01celadm01',
'o/192.168.131.3/FLASH_FD_14_dm01celadm01',
'o/192.168.131.3/FLASH_FD_12_dm01celadm01',
'o/192.168.131.3/FLASH_FD_11_dm01celadm01',
'o/192.168.131.3/FLASH_FD_05_dm01celadm01',
'o/192.168.131.3/FLASH_FD_08_dm01celadm01',
'o/192.168.131.3/FLASH_FD_09_dm01celadm01',
'o/192.168.131.3/FLASH_FD_10_dm01celadm01',
'o/192.168.131.3/FLASH_FD_13_dm01celadm01',
'o/192.168.131.3/FLASH_FD_07_dm01celadm01',
'o/192.168.131.3/FLASH_FD_00_dm01celadm01',
'o/192.168.131.3/FLASH_FD_04_dm01celadm01',
'o/192.168.131.3/FLASH_FD_02_dm01celadm01',
'o/192.168.131.3/FLASH_FD_15_dm01celadm01',
'o/192.168.131.5/FLASH_FD_13_dm01celadm03',
'o/192.168.131.5/FLASH_FD_03_dm01celadm03',
'o/192.168.131.5/FLASH_FD_01_dm01celadm03',
'o/192.168.131.5/FLASH_FD_15_dm01celadm03',
'o/192.168.131.5/FLASH_FD_02_dm01celadm03',
'o/192.168.131.5/FLASH_FD_10_dm01celadm03',
'o/192.168.131.5/FLASH_FD_14_dm01celadm03',
'o/192.168.131.5/FLASH_FD_08_dm01celadm03',
'o/192.168.131.5/FLASH_FD_06_dm01celadm03',
'o/192.168.131.5/FLASH_FD_05_dm01celadm03',
'o/192.168.131.5/FLASH_FD_11_dm01celadm03',
'o/192.168.131.5/FLASH_FD_09_dm01celadm03',
'o/192.168.131.5/FLASH_FD_00_dm01celadm03',
'o/192.168.131.5/FLASH_FD_12_dm01celadm03',
'o/192.168.131.5/FLASH_FD_04_dm01celadm03',
'o/192.168.131.5/FLASH_FD_07_dm01celadm03',
'o/192.168.131.4/FLASH_FD_12_dm01celadm02',
'o/192.168.131.4/FLASH_FD_07_dm01celadm02',
'o/192.168.131.4/FLASH_FD_15_dm01celadm02',
'o/192.168.131.4/FLASH_FD_14_dm01celadm02',
'o/192.168.131.4/FLASH_FD_00_dm01celadm02',
'o/192.168.131.4/FLASH_FD_09_dm01celadm02',
'o/192.168.131.4/FLASH_FD_13_dm01celadm02',
'o/192.168.131.4/FLASH_FD_06_dm01celadm02',
'o/192.168.131.4/FLASH_FD_05_dm01celadm02',
'o/192.168.131.4/FLASH_FD_10_dm01celadm02',
'o/192.168.131.4/FLASH_FD_08_dm01celadm02',
'o/192.168.131.4/FLASH_FD_11_dm01celadm02',
'o/192.168.131.4/FLASH_FD_02_dm01celadm02',
'o/192.168.131.4/FLASH_FD_04_dm01celadm02',
'o/192.168.131.4/FLASH_FD_01_dm01celadm02',
'o/192.168.131.3/FLASH_FD_06_dm01celadm01'
attribute 'content.type' = 'DATA',
'compatible.rdbms' = '11.2.0.2',
'compatible.asm' = '11.2.0.3',
'AU_SIZE' = '4M',
'cell.smart_scan_capable' = 'TRUE';

I chose high redundancy because I felt it was prudent for flash based storage, but you could go with normal redundancy too.

A quick look at the ASM free space shows we have a nice 1.5TB of flash based usable space within our new disk group –


NAME                FREE_GB_WITH_IMBALANCE
------------------- -----------------------
DATA_DM01                           2090.3
DBFS_DG                               92.9
RECO_DM01                            582.4
FLASH_DG                            1488.4
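I won’t reproduce the exact script behind those figures, but even a simple query against v$asm_diskgroup tells much the same story – USABLE_FILE_MB is the mirror-adjusted free space, without the imbalance adjustment shown above:


select name, round(usable_file_mb/1024,1) usable_gb
from v$asm_diskgroup
order by name;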

Next I created another DBFS store on the +FLASH_DG disk group, and copied the dump file from /u01 to /dbfs_direct/flashfs.


create bigfile tablespace flash_ts datafile '+FLASH_DG' nologging online permanent extent management local autoallocate segment space management auto;
@?/rdbms/admin/dbfs_create_filesystem FLASH_TS flashfs
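With the store available under /dbfs_direct/flashfs, the import just needs a database directory object pointing at that path – something like this, where the directory name and credentials are my assumptions:


create directory dp_flash as '/dbfs_direct/flashfs';

impdp mw/password directory=dp_flash dumpfile=expdat.dmp logfile=mw_exp:imp_flash.log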

The import from here took 8 minutes – almost twice as quick as from normal disk based storage.

I was also interested to see how the import would perform if we used a directory created within ASM directly.


asmcmd
cp /dbfs_direct/flashfs/expdat.dmp +flash_dg
SQL> create directory asm_flash as '+FLASH_DG';
impdp directory=asm_flash logfile=mw_exp:exp.log

This import actually took just 6 minutes. That’s interesting, although I’m not sure how practical it is – you would need to stage the dump file somewhere first in order to copy it into ASM with asmcmd.

Once the import is finished you’ll definitely want to return to using the flash storage for Exadata Smart Flash Cache.


@?/rdbms/admin/dbfs_drop_filesystem flashfs
drop tablespace flash_ts;
drop diskgroup flash_dg including contents;
CellCLI> drop griddisk all prefix=FLASH
CellCLI> create flashcache all;
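Once that has run on each cell, it’s worth confirming the cache has come back at its full size:


CellCLI> list flashcache detail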

So in summary, import speeds for our test case were as follows –

/u01 local to the compute node – 6 minutes. You’re limited somewhat in dump size here, plus you can’t parallelise across multiple nodes. However, it’s the best option for small imports.

DBFS with hard disk based storage – 14 minutes. Slowest, but it has the highest capacity and allows parallelism across nodes.

DBFS with flash disk based storage – 8 minutes. This is going to be the fastest option for large imports, as long as the dump fits within the flash space available to you. There are further gains to be had by parallelising across nodes.

ASM based database directory – 6 minutes. Any benefit is probably negated by the time taken to get the data into ASM in the first place.
