Import and Export

CnosDB can export data to the local file system or to object storage with the SQL statement COPY INTO, and can import data back from the local file system or object storage the same way.

Supported file formats are PARQUET, CSV, and JSON. Supported object storage services currently include AWS S3, Google Cloud Storage, and Azure Blob Storage.

Export data

Export to local file

Syntax

COPY INTO <path>
FROM [<database>.]<table_name>
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |

Example

Export PARQUET files to a local directory
COPY INTO 'file:///tmp/air/'
FROM "air"
FILE_FORMAT = (
TYPE = 'PARQUET'
);
Export CSV files to a local directory, with the delimiter set to ',' and no header row
COPY INTO 'file:///tmp/air'
FROM "air"
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = false
);
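Export JSON files to a local directory (a minimal sketch following the same pattern; the air table and path are the same placeholders used above)
COPY INTO 'file:///tmp/air/'
FROM "air"
FILE_FORMAT = (
TYPE = 'JSON'
);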

Export to AWS S3

Syntax

COPY INTO 's3://<bucket>[/<path>]'
FROM [<database>.]<table_name>
CONNECTION = (
region = '<string>'
, access_key_id = '<string>'
, secret_key = '<string>'
[, token = '<string>' ]
[, virtual_hosted_style = true | false ]
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| region | AWS region code |
| access_key_id | Access key ID |
| secret_key | Secret access key |
| token | (Optional) Temporary session token |
| virtual_hosted_style | (Optional) Whether to use virtual-hosted-style access; defaults to true |

Example

Export data as CSV to AWS S3, with the delimiter set to ',' and a header row
COPY INTO 's3://tmp/air'
FROM "air"
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);
Export data as PARQUET to AWS S3
COPY INTO 's3://tmp/air'
FROM "air"
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************'
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);
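If you authenticate with temporary credentials, the syntax above also accepts a session token and a virtual-hosted-style switch. A sketch with placeholder values (the masked strings are assumptions, not real credentials)
COPY INTO 's3://tmp/air'
FROM "air"
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************',
token = '****************',
virtual_hosted_style = false
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);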

Export to Google Cloud Storage

Syntax

COPY INTO 'gcs://<bucket>[/<path>]'
FROM [<database>.]<table_name>
CONNECTION = (
gcs_base_url = '<string>'
[, disable_oauth = true | false]
[, client_email = '<string>']
[, private_key = '<string>']
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| gcs_base_url | Base URL of Google Cloud Storage |
| disable_oauth | (Optional) Whether to disable OAuth; defaults to false |
| client_email | (Optional) Service account email address; required only when OAuth is not disabled |
| private_key | (Optional) Service account private key; required only when OAuth is not disabled |

Example

Export data as JSON to Google Cloud Storage
COPY INTO 'gcs://tmp/air'
FROM "air"
CONNECTION = (
gcs_base_url = 'https://storage.googleapis.com',
disable_oauth = false,
client_email = 'service_account@example.com',
private_key = '-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----'
)
FILE_FORMAT = (
TYPE = 'JSON'
);
Export data as CSV to Google Cloud Storage
COPY INTO 'gcs://tmp/air'
FROM "air"
CONNECTION = (
gcs_base_url = 'https://storage.googleapis.com',
disable_oauth = false,
client_email = 'service_account@example.com',
private_key = '-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);
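When the target endpoint does not require OAuth (for example, a local GCS emulator), disable_oauth = true lets you omit the service-account fields. A sketch; the emulator URL is an assumption for illustration only
COPY INTO 'gcs://tmp/air'
FROM "air"
CONNECTION = (
gcs_base_url = 'http://127.0.0.1:4443',
disable_oauth = true
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);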

Export to Azure Blob Storage

Syntax

COPY INTO 'azblob://<container>[/<path>]'
FROM [<database>.]<table_name>
CONNECTION = (
account = '<string>'
[, access_key = '<string>']
[, bearer_token = '<string>']
[, use_emulator = true | false]
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)

Parameter Description

These parameters provide the information needed to connect to and authenticate with an Azure storage account. Depending on your situation, you can authenticate with either access_key or bearer_token.

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| account | Name of the Azure storage account to connect to |
| access_key | Access key for the storage account, used for authentication and authorization |
| bearer_token | Authentication token that can be used in place of access_key |
| use_emulator | (Optional) Defaults to false. When true, the endpoint is taken from the AZURITE_BLOB_STORAGE_URL environment variable, falling back to http://127.0.0.1:10000 |

Example

Export data as CSV to Azure Blob Storage, with the delimiter set to ',' and a header row
COPY INTO 'azblob://tmp/air'
FROM "air"
CONNECTION = (
account = '***********',
access_key = '****************'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);
Export data as PARQUET to Microsoft Azure Blob Storage
COPY INTO 'azblob://tmp/air'
FROM "air"
CONNECTION = (
account = '***********',
access_key = '****************'
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);
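As noted in the parameter description, bearer_token can replace access_key. A sketch with a placeholder token
COPY INTO 'azblob://tmp/air'
FROM "air"
CONNECTION = (
account = '***********',
bearer_token = '****************'
)
FILE_FORMAT = (
TYPE = 'JSON'
);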

Import data

Before importing, make sure the target table already exists and that its column names and types match the data being imported.
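
For example, a table matching the air data used in the examples below might be created like this (the column names and types are assumptions; adjust them to your files):

CREATE TABLE air (
visibility DOUBLE,
temperature DOUBLE,
pressure DOUBLE,
TAGS(station)
);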

Import from local file

Syntax

COPY INTO [<database>.]<table_name> [(<time_col>, <field_col> [, <field_col>] ... [, TAGS (<tag_col> [, <tag_col>] ...)])]
FROM '<path>'
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)
COPY_OPTIONS = (
auto_infer_schema = true | false
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| auto_infer_schema | Whether to infer the file's schema automatically; if false, the schema of the target table is used |

Example

Import PARQUET files from a local directory
COPY INTO "air"
FROM 'file:///tmp/air/'
FILE_FORMAT = (
TYPE = 'PARQUET'
);
Import CSV files from a local directory

DELIMITER=',' indicates that the file is delimited by ','.

WITH_HEADER = true indicates that the file has a header.

auto_infer_schema = false indicates that the schema of the target table is used instead of being inferred from the file.

COPY INTO "air"
FROM "file:///tmp/air/"
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER=',',
WITH_HEADER = true
)
COPY_OPTIONS = (
auto_infer_schema = false
);
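The syntax above also accepts an explicit column list, which maps the file's columns onto the table's time, field, and tag columns. A sketch, assuming the air table sketched earlier with a pressure field and a station tag
COPY INTO "air"(time, pressure, TAGS(station))
FROM 'file:///tmp/air/'
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);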

Import data from AWS S3

Syntax

COPY INTO [<database>.]<table_name> [(<time_col>, <field_col> [, <field_col>] ... [, TAGS (<tag_col> [, <tag_col>] ...)])]
FROM 's3://<bucket>[/<path>]'
CONNECTION = (
region = '<string>'
, access_key_id = '<string>'
, secret_key = '<string>'
[, token = '<string>' ]
[, virtual_hosted_style = true | false ]
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)
COPY_OPTIONS = (
auto_infer_schema = true | false
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| auto_infer_schema | Whether to infer the file's schema automatically; if false, the schema of the target table is used |
| region | AWS region code |
| access_key_id | Access key ID |
| secret_key | Secret access key |
| token | (Optional) Temporary session token |
| virtual_hosted_style | (Optional) Whether to use virtual-hosted-style access; defaults to true |

Example

Import CSV data from AWS S3

DELIMITER=',' indicates that the file is delimited by ','.

WITH_HEADER = true indicates that the file has a header.

auto_infer_schema defaults to true, so the schema is inferred from the file automatically.

COPY INTO "air"
FROM 's3://tmp/air'
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);
Import PARQUET data from AWS S3
COPY INTO "air"
FROM 's3://tmp/air'
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************'
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);
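COPY_OPTIONS works the same way as for local imports. A sketch that disables schema inference so the target table's schema is used (credentials are placeholders)
COPY INTO "air"
FROM 's3://tmp/air'
CONNECTION = (
region = 'us-east-1',
access_key_id = '****************',
secret_key = '****************'
)
FILE_FORMAT = (
TYPE = 'PARQUET'
)
COPY_OPTIONS = (
auto_infer_schema = false
);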

Import data from Google Cloud Storage

Syntax

COPY INTO [<database>.]<table_name> [(<time_col>, <field_col> [, <field_col>] ... [, TAGS (<tag_col> [, <tag_col>] ...)])]
FROM 'gcs://<bucket>[/<path>]'
CONNECTION = (
gcs_base_url = '<string>'
[, disable_oauth = true | false]
[, client_email = '<string>']
[, private_key = '<string>']
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)
COPY_OPTIONS = (
auto_infer_schema = true | false
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| auto_infer_schema | Whether to infer the file's schema automatically; if false, the schema of the target table is used |
| gcs_base_url | Base URL of Google Cloud Storage |
| disable_oauth | (Optional) Whether to disable OAuth; defaults to false |
| client_email | (Optional) Service account email address; required only when OAuth is not disabled |
| private_key | (Optional) Service account private key; required only when OAuth is not disabled |

Example

Import JSON data from Google Cloud Storage
COPY INTO "air"
FROM 'gcs://tmp/air'
CONNECTION = (
gcs_base_url = 'https://storage.googleapis.com',
disable_oauth = false,
client_email = 'service_account@example.com',
private_key = '-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----'
)
FILE_FORMAT = (
TYPE = 'JSON'
);
Import CSV data from Google Cloud Storage
COPY INTO "air"
FROM 'gcs://tmp/air'
CONNECTION = (
gcs_base_url = 'https://storage.googleapis.com',
disable_oauth = false,
client_email = 'service_account@example.com',
private_key = '-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);

Import data from Azure Blob Storage

Syntax

COPY INTO [<database>.]<table_name> [(<time_col>, <field_col> [, <field_col>] ... [, TAGS (<tag_col> [, <tag_col>] ...)])]
FROM 'azblob://<container>[/<path>]'
CONNECTION = (
account = '<string>'
[, access_key = '<string>']
[, bearer_token = '<string>']
[, use_emulator = true | false]
)
FILE_FORMAT = (
TYPE = {'PARQUET' | 'CSV' | 'JSON'}
[, DELIMITER = '<character>']
[, WITH_HEADER = true | false]
)
COPY_OPTIONS = (
auto_infer_schema = true | false
)

Parameter Description

| Name | Description |
| ---- | ----------- |
| TYPE | File format: PARQUET, CSV, or JSON. Example: FILE_FORMAT = (TYPE='CSV') |
| DELIMITER | CSV only. Sets the column delimiter. Example: DELIMITER = ',' |
| WITH_HEADER | CSV only. Whether the file has a header row; defaults to true. Example: FILE_FORMAT = (TYPE='CSV', WITH_HEADER=true) |
| auto_infer_schema | Whether to infer the file's schema automatically; if false, the schema of the target table is used |
| account | Name of the Azure storage account to connect to |
| access_key | Access key for the storage account, used for authentication and authorization |
| bearer_token | Authentication token that can be used in place of access_key |
| use_emulator | (Optional) Defaults to false. When true, the endpoint is taken from the AZURITE_BLOB_STORAGE_URL environment variable, falling back to http://127.0.0.1:10000 |

Example

Import CSV data from Azure Blob Storage
COPY INTO "air"
FROM 'azblob://tmp/air'
CONNECTION = (
account = '***********',
access_key = '****************'
)
FILE_FORMAT = (
TYPE = 'CSV',
DELIMITER = ',',
WITH_HEADER = true
);
Import PARQUET data from Azure Blob Storage
COPY INTO "air"
FROM 'azblob://tmp/air'
CONNECTION = (
account = '***********',
access_key = '****************'
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);
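For local testing, use_emulator = true points the connection at an Azurite instance. A sketch; devstoreaccount1 is Azurite's default development account name, and the key is a placeholder
COPY INTO "air"
FROM 'azblob://tmp/air'
CONNECTION = (
account = 'devstoreaccount1',
access_key = '****************',
use_emulator = true
)
FILE_FORMAT = (
TYPE = 'PARQUET'
);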