
This Week in Databend #94

May 21, 2023 · 4 min read

PsiACE

Stay up to date with the latest weekly developments on Databend!


Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com.

What's On In Databend

Stay connected with the latest news about Databend.

Computed Columns

Computed columns are generated from other columns by a scalar expression. There are two types of computed columns: stored and virtual.

A stored computed column computes and stores the result value when a row is inserted. Use this SQL syntax to create one:

column_name <type> AS (<expr>) STORED

A virtual computed column, by contrast, is calculated at query time, and its result value is not stored. To create one, use this SQL syntax:

column_name <type> AS (<expr>) VIRTUAL
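To make the difference concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how Databend implements computed columns: the ToyTable class, the products table, and the column names are all hypothetical, and the DDL string only follows the syntax shown above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical DDL following the syntax above (table and column names are made up).
EXAMPLE_DDL = """
CREATE TABLE products (
    price DOUBLE,
    quantity DOUBLE,
    total DOUBLE AS (price * quantity) STORED,
    discounted DOUBLE AS (price * 0.5) VIRTUAL
)
"""


@dataclass
class ToyTable:
    """Toy model of stored vs. virtual computed columns (illustration only)."""

    stored: Dict[str, Callable[[Row], float]] = field(default_factory=dict)
    virtual: Dict[str, Callable[[Row], float]] = field(default_factory=dict)
    rows: List[Row] = field(default_factory=list)

    def insert(self, row: Row) -> None:
        materialized = dict(row)
        # Stored computed columns: evaluated once at insert time, result is kept.
        for name, expr in self.stored.items():
            materialized[name] = expr(row)
        self.rows.append(materialized)

    def select(self) -> List[Row]:
        result = []
        for row in self.rows:
            out = dict(row)
            # Virtual computed columns: evaluated on every read, never persisted.
            for name, expr in self.virtual.items():
                out[name] = expr(row)
            result.append(out)
        return result


table = ToyTable(
    stored={"total": lambda r: r["price"] * r["quantity"]},
    virtual={"discounted": lambda r: r["price"] * 0.5},
)
table.insert({"price": 4.0, "quantity": 3.0})
```

In this model, "total" is part of the persisted row, while "discounted" only appears in the output of select(). That mirrors the trade-off described above: stored columns cost storage, virtual columns cost computation at query time.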

VACUUM TABLE

The VACUUM TABLE command helps to optimize the system performance by freeing up storage space through the permanent removal of historical data files from a table. This includes:

  • Snapshots associated with the table, as well as their relevant segments and blocks.
  • Orphan files. Orphan files in Databend refer to snapshots, segments, and blocks that are no longer associated with the table. Orphan files might be generated from various operations and errors, such as during data backups and restores, and can take up valuable disk space and degrade the system performance over time.

VACUUM TABLE requires Enterprise Edition. To inquire about upgrading, please contact Databend Support.

If you are interested in learning more, please check out the resources listed below:

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Enable Cache in Python Binding

Databend supports data caching and query result caching, which can effectively accelerate queries. The Python bindings of Databend also support these features, albeit with slight differences.

Query result caching can be enabled conveniently with a SQL statement.

>>> from databend import SessionContext 
>>> ctx = SessionContext()
>>> ctx.sql("set enable_query_result_cache = 1")

Data caching can be enabled through environment variables.

>>> import os 
>>> os.environ["CACHE_DATA_CACHE_STORAGE"] = "disk"
>>> from databend import SessionContext
>>> ctx = SessionContext()
>>> ctx.sql("select * from system.configs where name like '%data_cache%'")
┌────────────────────────────────────────────────────────────────────────────┐
│ group   │ name                                     │ value   │ description │
│ String  │ String                                   │ String  │ String      │
├─────────┼──────────────────────────────────────────┼─────────┼─────────────┤
│ 'cache' │ 'data_cache_storage'                     │ 'disk'  │ ''          │
│ 'cache' │ 'table_data_cache_population_queue_size' │ '65536' │ ''          │
└────────────────────────────────────────────────────────────────────────────┘

Feel free to use it in your data science workflow.

Highlights

Here are some noteworthy items from this week; perhaps you can find something that interests you.

  • Read Docs | Date & Time - Formatting Date and Time to learn how to precisely control the format of time and date.
  • Added support for transforming data when loading it from a URI.
  • Added support for replacing with stage attachment.
  • Added bitmap-related functions: bitmap_contains, bitmap_has_all, bitmap_has_any, bitmap_or, bitmap_and, bitmap_xor, etc.
  • Added support for the intdiv operator //.
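As a rough guide to what the new bitmap functions compute, the sketch below models a bitmap as a Python frozenset of integers. This is only an illustration of the set semantics the function names suggest; the argument order and the set-based representation are assumptions, not Databend's implementation.

```python
from typing import FrozenSet

# Assumption: a bitmap is modeled as an immutable set of non-negative integers.
Bitmap = FrozenSet[int]


def bitmap_contains(bitmap: Bitmap, value: int) -> bool:
    """True if the bitmap contains the given value."""
    return value in bitmap


def bitmap_has_all(bitmap: Bitmap, other: Bitmap) -> bool:
    """True if every value of `other` is also in `bitmap`."""
    return other <= bitmap


def bitmap_has_any(bitmap: Bitmap, other: Bitmap) -> bool:
    """True if the two bitmaps share at least one value."""
    return not bitmap.isdisjoint(other)


def bitmap_or(left: Bitmap, right: Bitmap) -> Bitmap:
    """Values present in either bitmap (union)."""
    return left | right


def bitmap_and(left: Bitmap, right: Bitmap) -> Bitmap:
    """Values present in both bitmaps (intersection)."""
    return left & right


def bitmap_xor(left: Bitmap, right: Bitmap) -> Bitmap:
    """Values present in exactly one of the bitmaps (symmetric difference)."""
    return left ^ right


a: Bitmap = frozenset({1, 2, 3})
b: Bitmap = frozenset({3, 4})
```

With these two sample bitmaps, a and b overlap only on the value 3, so bitmap_has_any is true while bitmap_has_all is false.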

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Remove if_not_exists from the Meta Request

In CreateIndexReq/CreateTableReq, we use if_not_exists to indicate whether the request should tolerate an index/table that already exists.

pub struct CreateIndexReq {
    pub if_not_exists: bool,
    pub name_ident: IndexNameIdent,
    pub meta: IndexMeta,
}

The if_not_exists clause only affects the outcome that is presented to the user, and does not alter the behavior of the meta-service operation.

Therefore, it will be more effective for SchemaApi to provide either a Created or an Exist status code, allowing the caller to determine whether to generate an error message.
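The proposed split of responsibilities can be sketched as follows. This is a toy Python model, not the actual SchemaApi: the CreateStatus enum, ToySchemaApi class, and create_table_for_user helper are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Dict


class CreateStatus(Enum):
    """Hypothetical status codes returned by the meta-service."""

    CREATED = "created"
    EXIST = "exist"


class ToySchemaApi:
    """Toy stand-in for the meta-service; it knows nothing about if_not_exists."""

    def __init__(self) -> None:
        self._tables: Dict[str, dict] = {}

    def create_table(self, name: str, meta: dict) -> CreateStatus:
        # The operation behaves the same either way: report what happened.
        if name in self._tables:
            return CreateStatus.EXIST
        self._tables[name] = meta
        return CreateStatus.CREATED


def create_table_for_user(
    api: ToySchemaApi, name: str, meta: dict, if_not_exists: bool
) -> CreateStatus:
    """Caller-side logic: only here does if_not_exists decide whether to error."""
    status = api.create_table(name, meta)
    if status is CreateStatus.EXIST and not if_not_exists:
        raise ValueError(f"table '{name}' already exists")
    return status


api = ToySchemaApi()
first = create_table_for_user(api, "t1", {"engine": "fuse"}, if_not_exists=False)
second = create_table_for_user(api, "t1", {"engine": "fuse"}, if_not_exists=True)
```

In this sketch the request body carries no if_not_exists flag at all; the caller inspects the returned status and decides whether an existing table is an error.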

Issue #11456 | Moving if_not_exists out of meta request body

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

New Contributors

We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.

  • @silver-ymz made their first contribution in #11487. Added five bitmap-related functions.
  • @Jake-00 made their first contribution in #11503. Modified duplicate test case for SOUNDS LIKE syntax.
  • @gitccl made their first contribution in #11507. Added five bitmap-related functions and fixed panic when calling with empty bitmap.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.1.38-nightly...v1.1.43-nightly


🎉 Contributors
24 contributors

Thanks a lot to the contributors for their excellent work.

🎈Connect With Us

Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.

Join the Databend Community to try, get help, and contribute!