Managed table and external table in spark

8/6/2021



Table Types in Spark and Hive

Since native DDL support was introduced in Spark 2.0,
we have been able to control the location of a table's data.
You can check the table type with the SparkSession catalog API, spark.catalog.getTable.
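For example, here is a minimal PySpark sketch for inspecting a table's type (it assumes a table named developer already exists in the catalog):

# tableType reports MANAGED or EXTERNAL for a catalog table
tbl = spark.catalog.getTable("developer")
print(tbl.name, tbl.tableType)

# DESCRIBE TABLE EXTENDED also shows the type and the data location
spark.sql("DESCRIBE TABLE EXTENDED developer").show(truncate=False)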

Managed Table

When a managed table is dropped, Spark deletes both the table data in the warehouse and the metadata in the metastore.
LOCATION is not mandatory for managed tables.
The location of the data files is {current_working_directory}/spark-warehouse.
Below is an example of a managed table:



spark.sql("CREATE TABLE developer (id INT, name STRING)")

# Or in Delta format
batched_orders.write.format("delta").partitionBy('submitted_yyyy_mm').mode("overwrite").saveAsTable(orders_table)
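To see the delete behaviour, a hedged sketch (assuming the managed developer table created above):

# Dropping a MANAGED table removes the metastore entry AND the data files
# under the warehouse directory (e.g. spark-warehouse/developer).
spark.sql("DROP TABLE developer")
print([t.name for t in spark.catalog.listTables()])  # developer is no longer listed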


External Table

When an external table is dropped, Spark deletes only the metadata in the metastore; the data files are left in place.
LOCATION is mandatory for EXTERNAL tables.
The location of the data files is user-defined.
Below is an example of an external table:


# The created tables are EXTERNAL
spark.sql("CREATE TABLE developer (id INT, name STRING) LOCATION '/tmp/tables/developer'")
spark.sql("CREATE EXTERNAL TABLE developer (id INT, name STRING) LOCATION '/tmp/tables/developer'")
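The metadata-only behaviour can be checked with a small sketch (assuming the external developer table above, with its data under /tmp/tables/developer):

# Dropping an EXTERNAL table removes only the metastore entry;
# the files under /tmp/tables/developer stay on disk and can be
# re-registered later with another CREATE TABLE ... LOCATION statement.
spark.sql("DROP TABLE developer")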


How to create a managed table using Delta

batched_orders.write.format("delta").partitionBy('submitted_yyyy_mm').mode("overwrite").saveAsTable(orders_table)
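To confirm that the resulting Delta table is managed, a small hedged check (orders_table holds the table name as a string, as in the write above):

# tableType should report MANAGED, and the data sits under the warehouse directory
print(spark.catalog.getTable(orders_table).tableType)
print(spark.conf.get("spark.sql.warehouse.dir"))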

How to create an EXTERNAL table using Delta

 

CREATE TABLE orders USING DELTA LOCATION '<path>'
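The same result from the DataFrame API, as a hedged sketch (orders_df and the /tmp/tables/orders path are illustrative placeholders, and the delta-spark package must be available):

# Write Delta files to an explicit path...
orders_df.write.format("delta").mode("overwrite").save("/tmp/tables/orders")

# ...then register an EXTERNAL table over that path
spark.sql("CREATE TABLE orders USING DELTA LOCATION '/tmp/tables/orders'")

Passing .option("path", "/tmp/tables/orders") to saveAsTable achieves the same thing in one step, because a table saved with an explicit path is registered as EXTERNAL.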


How to create a managed table using Parquet


# The created table is MANAGED.
# The location of the developer data files is {current_working_directory}/spark-warehouse/developer
spark.sql("CREATE TABLE developer (id INT, name STRING) USING PARQUET")
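A quick hedged usage sketch (the sample row is illustrative):

# Insert a row and read it back through the catalog name;
# the Parquet files land under spark-warehouse/developer because the table is managed.
spark.sql("INSERT INTO developer VALUES (1, 'alice')")
spark.table("developer").show()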
 

How to create an EXTERNAL table using Parquet


# The created table is EXTERNAL
spark.sql("CREATE TABLE developer (id INT, name STRING) USING PARQUET OPTIONS('path'='/tmp/tables/table6')")
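The DataFrame API equivalent is to pass an explicit path to saveAsTable; a hedged sketch (developer_df is an assumed DataFrame with matching columns):

# saveAsTable with an explicit path option registers an EXTERNAL table
developer_df.write.format("parquet").option("path", "/tmp/tables/table6").saveAsTable("developer")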

 

Conclusion of Article: Table Types in Spark

Here we learned how to create managed and external tables in Spark and how to implement them with the Parquet and Delta formats. Spark supports other formats as well.

Hope you liked it. Please follow our Instagram and Facebook accounts for more updates.

 

For more related articles on Hive external and internal tables, please check the link below:

Article