YCQL Timestamp Precision (Millis vs Micros)

21 Nov 2024
Product Affected Versions Related Issues Fixed In
YCQL All #11052, #23476 v2.25.0.0, v2024.1.4.0, v2024.2.1.0, v2.20.9.0

Description

In YCQL, timestamp data inserted with the currenttimestamp() function or explicit timestamp strings with microseconds precision (for example, 2024-08-26 19:23:38.537281+0000) may not be returned for queries with an equality condition on the timestamp. This can cause discrepancies during queries or index scans on tables where timestamps were part of the primary key. The data can still be retrieved as expected by primary key or index-only scan queries with no condition or inequality conditions on the timestamp column.

The root cause is that in YCQL, the timestamp type has millisecond precision, to align with vanilla Cassandra, ensuring compatibility with Cassandra drivers, and tools. However, YCQL can internally store data with microsecond precision, adjusting the precision to milliseconds when data is sent, or received through the YCQL API. The discrepancy between the wire protocol's millisecond precision and the storage system's microsecond precision can lead to unexpected behavior.

Mitigation

For new writes

For versions before v2.25.0.0, v2024.2.1.0, v2024.1.4.0, and v2.20.9.0, you can adjust the new writes to use the correct precision by modifying your application to use totimestamp(now()) instead of currenttimestamp() to generate millisecond-precision timestamp values that are compatible with the wire protocol.

For versions starting from v2.25.0.0, v2024.2.1.0, v2024.1.4.0, and v2.20.9.0, the database can automatically convert timestamps with microsecond precision to milliseconds if cql_revert_to_partial_microsecond_support flag is set to false (default is true). When this flag is disabled (set to false), timestamps with microsecond precision are truncated (that is, floored) to milliseconds before being stored in underlying storage, ensuring consistency with the YCQL layer. When the flag is set to true, timestamps are stored in the underlying storage with the microseconds precision.

For existing data

The above fixes do not retroactively update any values already inserted into the database before the fix. Existing timestamp data stored with microseconds precision will not be automatically converted, and queries involving these timestamps may still face the same issues unless the data is reinserted or updated.

If you have already inserted data with non-zero microsecond precision, contact Yugabyte Support for assistance running a script to read and re-insert the affected rows with millisecond precision.

Details

Cassandra (the base for YCQL) only supports timestamps with millisecond precision. However, YCQL's underlying storage can store timestamps with microsecond precision, leading to inconsistencies. When timestamps with non-zero microseconds were inserted into YCQL, they were stored in the underlying storage with microsecond precision. However, when queried, the Cassandra layer rounded these timestamps to milliseconds, losing the microsecond details.

This can create issues in certain scenarios as follows:

  • Fetch mismatch: When querying rows inserted using current timestamp() or explicit timestamp strings, the SELECT * query would return timestamps rounded to milliseconds. If the same value was then used in a WHERE clause to filter data, it would fail to fetch the expected rows.
  • Index scan failure: For tables with a timestamp as part of the primary key and a secondary index, index scans would fail. The secondary index would retrieve the timestamp stored in milliseconds, but when fetching the full table data from underlying storage (where it was stored in microseconds), the values would not match.

Examples

  • With cql_revert_to_partial_microsecond_support=true

    
    desc test
    CREATE TABLE tempkeyspace.test (
        id int PRIMARY KEY,
        created_date timestamp
    ) WITH default_time_to_live = 0
        AND transactions = {'enabled': 'true'};
    
    INSERT INTO test(id, created_date) VALUES (1, currenttimestamp()) ;
    
    INSERT INTO test(id, created_date) VALUES (2, '2024-09-25 15:52:40.768819+0000') ;
    
    SELECT * FROM t1;
    id | created_date
    ----+---------------------------------
      1 | 2024-09-30 05:54:12.340000+0000
      2 | 2024-09-25 15:52:40.768000+0000
    
    SELECT * FROM test WHERE created_date='2024-09-30 05:54:12.340000+0000';
    
    id | created_date
    ----+--------------
    
    SELECT * FROM test WHERE created_date='2024-09-25 15:52:40.768000+0000';
    
    id | created_date
    ----+--------------
    
  • With cql_revert_to_partial_microsecond_support=false

    desc test
    CREATE TABLE tempkeyspace.test (
        id int PRIMARY KEY,
        created_date timestamp
    ) WITH default_time_to_live = 0
        AND transactions = {'enabled': 'true'};
    
    INSERT INTO test(id, created_date) VALUES (1, currenttimestamp()) ;
    
    INSERT INTO test(id, created_date) VALUES (2, '2024-09-25 15:52:40.768819+0000') ;
    
    SELECT * FROM t1;
    id | created_date
    ----+---------------------------------
      1 | 2024-09-30 05:58:19.550000+0000
      2 | 2024-09-25 15:52:40.768000+0000
    
    SELECT * FROM test WHERE created_date='2024-09-30 05:58:19.550000+0000';
    id | created_date
    ----+---------------------------------
      1 | 2024-09-30 05:58:19.550000+0000
    
    
    SELECT * FROM test WHERE created_date='2024-09-25 15:52:40.768000+0000';
    
      id | created_date
    ----+---------------------------------
      2 | 2024-09-25 15:52:40.768000+0000