Managing data format schemas in Managed Service for ClickHouse
Managed Service for ClickHouse lets you INSERT and SELECT data in different formats. Most of those formats are self-descriptive: they already contain a format schema that describes the acceptable data types, their order, and their representation in the format. This lets you, for example, insert data directly from a file.
Note
A format schema describes the format of input or output data, while a data schema describes the structure and layout of the ClickHouse databases and tables that store this data. These concepts are not interchangeable.
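The Cap'n Proto and Protobuf formats are not self-descriptive, so they need a format schema that is supplied separately.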
You can add one or more such format schemas to a Managed Service for ClickHouse cluster and use them to input and output data in the relevant formats.
Warning
To use the format schemas you added, insert the data into Managed Service for ClickHouse using the HTTP interface.
For more information about data formats, see the ClickHouse documentation.
Examples of working with the Cap'n Proto and Protobuf formats when inserting data into a cluster are given in Adding data to ClickHouse.
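As a rough illustration of the HTTP-interface insert mentioned in the warning above, the sketch below composes an INSERT statement that refers to a connected format schema and shows how it might be sent with curl. The table `db1.users`, the schema name `user_schema`, the message type `User`, the `CH_USER` and `CH_PASSWORD` variables, and port 8443 are assumptions for illustration, and the sketch assumes TLS trust for the cluster host is already configured. Only the query-building part runs here; `insert_file` is defined but not called.

```shell
# Percent-encode the few reserved characters that appear in the query below.
# This is a deliberately small encoder, not a general-purpose one.
encode_query() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/ /%20/g' -e "s/'/%27/g" \
    -e 's/=/%3D/g' -e 's/:/%3A/g'
}

# Build the INSERT statement.
# $1 table, $2 format (Protobuf or CapnProto), $3 schema name, $4 message type
# Assumption: format_schema takes the form "<schema name>:<message type>".
build_insert_query() {
  printf "INSERT INTO %s SETTINGS format_schema='%s:%s' FORMAT %s" \
    "$1" "$3" "$4" "$2"
}

# Send a binary data file through the HTTP interface (defined only, not called).
# $1 cluster host, $2 query, $3 data file
insert_file() {
  curl --silent --show-error --fail \
    --header "X-ClickHouse-User: ${CH_USER}" \
    --header "X-ClickHouse-Key: ${CH_PASSWORD}" \
    --data-binary "@$3" \
    "https://$1:8443/?query=$(encode_query "$2")"
}

query=$(build_insert_query "db1.users" "Protobuf" "user_schema" "User")
echo "$query"
```

Keeping the statement builder separate from the network call makes it easy to review the exact query before any data is sent.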
Before connecting the format schema
Managed Service for ClickHouse only works with readable data format schemas imported to Object Storage. Before connecting the schema to a cluster:
- Prepare a file with a format schema (see the documentation for Cap'n Proto and Protobuf).
- Import the file with the data format schema to Object Storage.
- Get a link to the schema file.
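As a rough illustration of the first step, the sketch below writes a minimal Protobuf schema file from the shell. The file name `user.proto`, the `User` message, and its fields are hypothetical; a real schema has to match the data you plan to insert.

```shell
# Hypothetical file name; override with SCHEMA_FILE if needed.
schema_file="${SCHEMA_FILE:-user.proto}"

# Write a minimal proto3 schema (the message and its fields are placeholders).
cat > "$schema_file" <<'EOF'
syntax = "proto3";

message User {
  uint64 id = 1;
  string name = 2;
}
EOF

echo "wrote $schema_file"
```

The resulting file is what you would then import to Object Storage in the second step.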
Connecting the format schema
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and open the Data format schemas tab.
- Click Add schema.
- In the Add schema dialog box, fill out the form by completing the URL field with the previously generated link to the format schema file.
- Click Add.
If you don't have the Nebius AI command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To connect a format schema to a cluster, run the command:
- For Cap'n Proto:
ncp managed-clickhouse format-schema create "<format schema name>" \
--cluster-name="<cluster name>" \
--type="capnproto" \
--uri="<link to the file in Object Storage>"
- For Protobuf:
ncp managed-clickhouse format-schema create "<format schema name>" \
--cluster-name="<cluster name>" \
--type="protobuf" \
--uri="<link to the file in Object Storage>"
You can get the cluster name from the list of clusters in the folder.
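If you connect schemas from a script, a thin wrapper around the command above can validate the schema type and offer a dry run before anything is changed. The following is a minimal sketch: the `run` and `connect_schema` helpers, the `DRY_RUN` variable, and the sample names and link are hypothetical, and only the `ncp` subcommand and flags shown above are used.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Connect one format schema to a cluster.
# $1 schema name, $2 cluster name, $3 type, $4 link to the file in Object Storage
connect_schema() {
  case "$3" in
    capnproto|protobuf) ;;
    *) echo "unsupported schema type: $3" >&2; return 1 ;;
  esac
  run ncp managed-clickhouse format-schema create "$1" \
    --cluster-name="$2" --type="$3" --uri="$4"
}

# Dry run with placeholder values: prints the command instead of executing it.
DRY_RUN=1
out=$(connect_schema "user_schema" "my-cluster" "protobuf" "https://example.com/user.proto")
echo "$out"
```

Unsetting `DRY_RUN` makes the same call execute the real command, so the printed line can be reviewed first.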
Changing a format schema
Managed Service for ClickHouse doesn't track changes in the format schema file that is in the Object Storage bucket.
To update the contents of a schema that is already connected to the cluster:
- Upload the file with the updated format schema to Object Storage.
- Get a link to this file.
- Change the parameters of the format schema that is connected to Managed Service for ClickHouse by providing a new link to the format schema file.
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and open the Data format schemas tab.
- Select the appropriate schema, click
If you don't have the Nebius AI command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To change the link to the format schema file in Object Storage, run the command:
ncp managed-clickhouse format-schema update "<format schema name>" \
--cluster-name="<cluster name>" \
--uri="<new link to the file in Object Storage>"
You can get the schema name from the list of format schemas in the cluster, and the cluster name from the list of clusters in the folder.
Disabling a format schema
Note
After you disable a format schema, the corresponding object is kept in the Object Storage bucket. If you no longer need this object, you can delete it.
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and open the Data format schemas tab.
- Select the appropriate schema, click
If you don't have the Nebius AI command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To disable a format schema, run the command:
ncp managed-clickhouse format-schema delete "<format schema name>" \
--cluster-name="<cluster name>"
You can get the schema name from the list of format schemas in the cluster, and the cluster name from the list of clusters in the folder.
Getting a list of format schemas in a cluster
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and open the Data format schemas tab.
If you don't have the Nebius AI command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To get a list of format schemas in a cluster, run the command:
ncp managed-clickhouse format-schema list --cluster-name="<cluster name>"
You can get the cluster name from the list of clusters in the folder.
Getting detailed information about a format schema
If you don't have the Nebius AI command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To get detailed information about a format schema, run the command:
ncp managed-clickhouse format-schema get "<format schema name>" \
--cluster-name="<cluster name>"
You can get the schema name from the list of format schemas in the cluster, and the cluster name from the list of clusters in the folder.