Snowflake Zero Copy Clone 101—An Essential Guide (2024)
Snowflake zero copy clone is a feature that lets users clone a database, schema, or table quickly and without additional Snowflake storage costs at the time of cloning. Depending on the size of the source object, a clone typically completes within a few minutes and needs none of the complex manual configuration that conventional databases often require. This article covers all you need to know about Snowflake zero copy clone.
Let's dive in!
What is Snowflake zero copy clone?
Snowflake zero copy clone, often referred to as "cloning", is a feature in Snowflake that creates an exact copy of a database, schema, or table without duplicating any physical data, so the clone takes very little time to create and consumes no extra storage space at creation. Instead, a logical reference to the source object's data is created, allowing for independent modifications to both the original and cloned objects. Additional Snowflake storage costs arise only once the data in either object is changed.
Use-cases of Snowflake zero copy clone
Snowflake zero copy clone provides users with substantial flexibility and freedom, with use cases like:
- To quickly perform backups of Tables, Schemas, and Databases.
- To create a free sandbox to enable parallel use cases.
- To enable quick object rollback capability.
- To create various environments (e.g., Development, Testing, Staging, etc.).
- To test possible modifications or developments without creating a new environment.
Snowflake zero copy clone provides businesses with smarter, faster, and more flexible data management capabilities.
How does Snowflake zero copy clone work?
The Snowflake zero copy clone feature allows users to clone a database object without making a copy of the data. This is possible because of Snowflake's micro-partitions, which divide all table data into small chunks that each contain between 50 and 500 MB of uncompressed data. The actual size of the data stored in Snowflake is smaller, because the data is always stored compressed. When cloning a database object, Snowflake simply creates new metadata entries pointing to the micro-partitions of the original source object, rather than copying the data. This process does not involve any user intervention and does not duplicate the data itself, which is why it's called "zero copy clone".
To gain a better understanding, let's deep dive even further.
To illustrate this, consider a table, EMPLOYEE, and its clone, EMPLOYEE_CLONE, in a Snowflake database. The metadata layer in Snowflake connects the metadata of EMPLOYEE to the micro-partitions in the storage layer where the actual data resides. When the EMPLOYEE_CLONE table is created, Snowflake generates a new set of metadata pointing to the same micro-partitions that store the data for EMPLOYEE. Essentially, EMPLOYEE_CLONE is a new set of metadata over EMPLOYEE's data rather than a physical copy of it. The beauty of this approach is that it enables us to create clones of tables quickly without duplicating the actual data, saving time and storage space. Moreover, although the clone initially shares the same set of micro-partitions as the original table, the two tables are independent: changes made to the data in one table are not reflected in the other.
In Snowflake, micro-partitions are immutable: once created, they cannot be altered. If data within a micro-partition needs to change, Snowflake writes a new micro-partition containing the updated data (the existing partition is retained to support Fail-safe and Time Travel). For instance, when data in the EMPLOYEE_CLONE table that lives in micro-partition M-P-3 is modified, Snowflake writes the changed data to a newly generated micro-partition (M-P-4) and references it exclusively from the EMPLOYEE_CLONE table. EMPLOYEE continues to point to M-P-3, so additional Snowflake storage costs are incurred only for the modified data rather than for the entire clone.
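The pointer-sharing and copy-on-write behavior described above can be sketched with a small toy model. This is not Snowflake code: the `Storage` and `Table` classes, their methods, and the sample rows are all hypothetical names made up for illustration, and the model ignores compression, Time Travel, and Fail-safe.

```python
from dataclasses import dataclass, field
from itertools import count

# Toy model only: Storage, Table, clone(), and update() are illustrative
# names and are not part of any Snowflake API.


@dataclass
class Storage:
    """Holds immutable micro-partitions keyed by an ID such as 'M-P-1'."""
    partitions: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def write(self, rows):
        pid = f"M-P-{next(self._ids)}"
        self.partitions[pid] = tuple(rows)  # stored as a tuple: never edited in place
        return pid


@dataclass
class Table:
    """A table is just metadata: a name plus pointers to micro-partitions."""
    name: str
    storage: Storage
    partition_ids: list = field(default_factory=list)

    def clone(self, new_name):
        # Zero copy: duplicate only the list of pointers, not the data.
        return Table(new_name, self.storage, list(self.partition_ids))

    def update(self, pid, new_rows):
        # Copy-on-write: write a new partition and repoint this table only.
        new_pid = self.storage.write(new_rows)
        self.partition_ids[self.partition_ids.index(pid)] = new_pid
        return new_pid

    def rows(self):
        return [r for pid in self.partition_ids
                for r in self.storage.partitions[pid]]


storage = Storage()
employee = Table("EMPLOYEE", storage)
for chunk in (["alice"], ["bob"], ["carol"]):
    employee.partition_ids.append(storage.write(chunk))

employee_clone = employee.clone("EMPLOYEE_CLONE")
partitions_after_clone = len(storage.partitions)  # cloning wrote no new partitions

# Modify data that lives in M-P-3, but only through the clone.
new_pid = employee_clone.update("M-P-3", ["carol-updated"])

print(employee.partition_ids)
print(employee_clone.partition_ids)
```

In this sketch the clone costs nothing until it is updated, and the update adds exactly one partition that only the clone references, while the original table keeps pointing at the old one.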
What are the benefits of Snowflake zero copy clone?
Snowflake zero copy clone feature offers a variety of beneficial characteristics. Let's look at some of the key benefits:
- Effective data cloning: Snowflake zero copy clone allows you to create fully-usable copies of data without physically copying the data, significantly reducing the time required to clone large objects.
- Saves storage space and costs: It doesn't require the physical duplication of data or underlying storage, so a clone consumes no additional storage space until its data diverges from the source, which can save on Snowflake costs.
- Hassle-free cloning: It provides a straightforward process for creating copies of your tables, schemas, and databases using the keyword "CLONE" without needing administrative privileges.
- Single-source data management: It creates a new set of metadata pointing to the same micro-partitions that store the original data. Each clone update generates new micro-partitions that relate solely to the clone.
- Data Security: It maintains the same level of security as the original data. This ensures that sensitive data is protected even when it's cloned.
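As a rough illustration of the "CLONE" keyword mentioned above, the sketch below builds the corresponding SQL statements as strings. The helper `clone_statement` and the object names (`PROD_DB`, `DEV_DB`, and so on) are hypothetical placeholders, and the helper does no identifier quoting or validation, so treat it as a starting point rather than a finished utility.

```python
# Sketch only: clone_statement() is a hypothetical helper, and the object
# names used below (PROD_DB, DEV_DB, ...) are placeholders.

CLONEABLE_CONTAINERS = ("DATABASE", "SCHEMA", "TABLE")


def clone_statement(object_type, source, target):
    """Build a CREATE <object> ... CLONE statement as a string."""
    kind = object_type.upper()
    if kind not in CLONEABLE_CONTAINERS:
        raise ValueError(f"unsupported object type for this sketch: {object_type}")
    return f"CREATE {kind} {target} CLONE {source};"


statements = [
    clone_statement("database", "PROD_DB", "DEV_DB"),
    clone_statement("schema", "PROD_DB.SALES", "PROD_DB.SALES_TEST"),
    clone_statement("table", "EMPLOYEE", "EMPLOYEE_CLONE"),
]

for statement in statements:
    print(statement)
```

The generated statements would then be run through whichever Snowflake client you already use, such as a worksheet or a connector session.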
What are the limitations of Snowflake zero copy clone?
Snowflake zero copy clone feature offers many benefits. Still, there are certain limitations to keep in mind:
- Resource requirements and performance impact: Cloning operations require adequate computing resources, so excessive cloning can lead to performance degradation.
- Longer clone time for many micro-partitions: Cloning a table with a large number of micro-partitions may take longer, although it is still faster than a traditional copy.
- Unsupported Object Types for Cloning: Cloning does not support all object types.
Which are the objects supported in Snowflake zero copy clone?
Snowflake zero copy clone feature supports cloning of the following database objects:
- Databases
- Schemas
- Tables
- Views
- Materialized views
- Sequences
Note: When a database object is cloned, the clone is not a physical copy of the source object; rather, its metadata references the same underlying data, and modifications to the clone do not affect the source object. The clone has its own set of metadata, including its own access controls, so the user must ensure that the appropriate permissions are granted on the clone.
How does access control work with cloned objects in Snowflake?
When using Snowflake's zero copy clone feature, it's important to keep in mind that a cloned object does not automatically inherit the privileges granted on the source object. This means that an account admin (ACCOUNTADMIN) or the owner of the cloned object must explicitly grant any required privileges on the newly created clone.
If the source object is a database or schema, however, the privileges granted on its child objects are replicated to the corresponding child objects in the clone. In order to create a clone, the current role must have the necessary privileges on the source object. For example, tables require the SELECT privilege; pipes, streams, and tasks require the OWNERSHIP privilege; and other object types require the USAGE privilege.
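The privilege rules above can be captured in a small lookup, shown here as a minimal sketch. The function names, the `ANALYST` role, and the table name are hypothetical; the mapping simply restates the rules from the paragraph (SELECT for tables, OWNERSHIP for pipes, streams, and tasks, USAGE for everything else), and the second helper builds the explicit grant that a freshly cloned table needs.

```python
# Sketch only: both helpers and the role/table names below are hypothetical.

# Privilege the current role needs on the SOURCE object in order to clone it.
REQUIRED_PRIVILEGE = {
    "TABLE": "SELECT",
    "PIPE": "OWNERSHIP",
    "STREAM": "OWNERSHIP",
    "TASK": "OWNERSHIP",
}


def privilege_needed_to_clone(object_type):
    """Return the source-side privilege; other object types fall back to USAGE."""
    return REQUIRED_PRIVILEGE.get(object_type.upper(), "USAGE")


def grant_statement(privilege, object_type, name, role):
    """Build a GRANT statement for re-granting access on a new clone."""
    return f"GRANT {privilege} ON {object_type.upper()} {name} TO ROLE {role};"


print(privilege_needed_to_clone("table"))
print(privilege_needed_to_clone("stream"))
print(privilege_needed_to_clone("schema"))

# A cloned table does not carry over grants made on the source table itself,
# so access has to be granted again on the clone.
print(grant_statement("SELECT", "table", "EMPLOYEE_CLONE", "ANALYST"))
```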
What are the account-level objects not supported in Snowflake zero copy clone?
Not every object can be cloned with Snowflake zero copy clone. In particular, objects that exist at the account level, rather than inside a database or schema, are not supported. Some examples of account-level objects are:
- Account-level roles
- Users
- Grants
- Virtual Warehouses
- Resource monitors
- Storage integrations
Conclusion
The Snowflake zero copy clone feature provides an innovative and cost-efficient way for users to clone tables without incurring additional Snowflake storage costs up front. This process streamlines the workflow, allowing databases, tables, and schemas to be cloned without creating separate environments.
This article provided an overview of Snowflake zero copy clone, from how it works to its potential use cases, benefits, limitations, and access control behavior.
If you're interested in delving into a comprehensive guide that walks you through the process of creating a Snowflake zero copy clone table from the ground up, be sure to take a look at this article!
FAQs
Why is it called zero copy clone?
The term "Zero Copy Clone" is used because Snowflake's cloning process doesn't involve physical data copying. It creates a reference to the source data, eliminating the need for duplication and resulting in zero additional storage costs.
How does Snowflake zero copy clone work?
Snowflake zero copy clone works by creating new metadata entries that point to the micro-partitions of the original source object instead of making a physical copy of the data.
What are the advantages of zero copy cloning in Snowflake?
- Effective data cloning without physical duplication, saving time.
- Storage space and cost savings as it doesn't consume additional storage.
- Hassle-free cloning process using the "CLONE" keyword.
- Single-source data management with new metadata for each clone.
- Maintaining data security and access controls.
What are the limitations of Snowflake zero copy clone?
- Resource requirements and potential performance impact.
- Longer clone time for tables with a large number of micro-partitions.
- Not all object types are supported for cloning.
Which objects are supported in Snowflake Zero Copy Cloning?
- Databases
- Schemas
- Tables
- Views
- Materialized views
- Sequences
Can Snowflake stages be cloned?
Yes, individual external named stages in Snowflake can be cloned. External stages refer to buckets or containers in external cloud storage. Cloning an external stage does not affect the referenced cloud storage. However, internal (Snowflake) named stages cannot be cloned.
Can you clone internal named stages?
No, internal named stages cannot be cloned.
How does Zero Copy Cloning save time and money?
Zero Copy Cloning eliminates the need for creating multiple development environments in separate accounts, reducing costs and time spent on creating large copies of production tables.