Managed table and external table in spark
Since native DDL support was introduced in Spark 2.0, we can control where a table's data is stored.
You can check a table's type with the SparkSession API spark.catalog.getTable.
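For example, the Table object returned by spark.catalog.getTable exposes a tableType field. Here is a minimal sketch (assuming a recent PySpark version and a table named developer, which is created in the examples below; older versions can fall back to listTables):

from pyspark.sql import SparkSession

# Assumes a local session; adjust appName/master for your environment.
spark = SparkSession.builder.appName("table-type-check").getOrCreate()

# tableType is 'MANAGED' or 'EXTERNAL'
print(spark.catalog.getTable("developer").tableType)

# Or list every table in the current database together with its type
for t in spark.catalog.listTables():
    print(t.name, t.tableType)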
When you drop a managed table, Spark will delete both the table data in the warehouse and the metadata in the metastore.
LOCATION is not mandatory for managed tables.
The location of the data files is {current_working_directory}/spark-warehouse.
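You can confirm where the warehouse lives in your own session by reading the spark.sql.warehouse.dir setting; a minimal sketch, reusing the spark session from above:

# Managed tables are stored under this directory unless you override it.
print(spark.conf.get("spark.sql.warehouse.dir"))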
Below is an example of a managed table:
spark.sql("CREATE TABLE developer (id int, name String)")
// OR in Delta format
batched_orders.write.format("delta").partitionBy('submitted_yyyy_mm').mode("overwrite").saveAsTable(orders_table)
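To see the drop behaviour described above, you can drop the managed table and check that both the catalog entry and the warehouse files are gone. A minimal sketch, assuming the default warehouse directory under the current working directory:

import os

# Dropping a MANAGED table deletes the metadata AND the data files.
spark.sql("DROP TABLE developer")

# The table no longer shows up in the catalog...
print([t.name for t in spark.catalog.listTables()])
# ...and its folder under spark-warehouse has been removed as well.
print(os.path.exists("spark-warehouse/developer"))   # expected: False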
When you drop an external table, Spark will delete only the metadata in the metastore.
LOCATION is mandatory for EXTERNAL tables.
The location of the data files is user-defined.
Below is an example of an external table:
// The created tables are EXTERNAL
spark.sql("CREATE TABLE developer (id int, name String) LOCATION '/tmp/tables/developer'")
spark.sql("CREATE EXTERNAL TABLE developer (id int, name String) LOCATION '/tmp/tables/developer'")
// OR in Delta format: write the data to a path, then register an external table over that location
batched_orders.write.format("delta").partitionBy('submitted_yyyy_mm').mode("overwrite").save("path")
spark.sql("CREATE TABLE orders USING DELTA LOCATION 'path'")
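Conversely, dropping an external table removes only the catalog entry; the files at the user-defined location survive and the table can later be re-created over the same LOCATION. A minimal sketch, reusing the '/tmp/tables/developer' path from the example above:

import os

# Dropping an EXTERNAL table deletes only the metadata in the metastore.
spark.sql("DROP TABLE developer")

# The data files at the user-defined location are left untouched.
print(os.path.exists("/tmp/tables/developer"))   # expected: True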
// With USING PARQUET and no path, the created table is MANAGED.
// The location of the developer data files is {current_working_directory}/spark-warehouse/developer
spark.sql("CREATE TABLE developer (id int, name String) USING PARQUET")
// With an explicit path option, the created table is EXTERNAL
spark.sql("CREATE TABLE developer (id int, name String) USING PARQUET OPTIONS('path'='/tmp/tables/table6')")
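A quick way to double-check which kind of table you ended up with is DESCRIBE TABLE EXTENDED, whose detailed section contains a Type row; a minimal sketch (the filter only keeps the output short):

# Prints MANAGED or EXTERNAL for the developer table.
spark.sql("DESCRIBE TABLE EXTENDED developer").filter("col_name = 'Type'").show()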
Here we learned how to create managed and external tables in Spark, and how to implement them with the Parquet and Delta formats; other formats are supported as well.
Hope you like it. Please follow our Instagram and Facebook accounts for more updates.