I have a database table with a clustered index, basically on a timestamp. If I insert a large amount of data out of order relative to that clustered index (ie. I write data with a timestamp from any point in time), what sort of performance impact will occur? Will it require the data to do a lot of reordering of the data on disk, or rewrite indexes, etc?
impact of writing data out of order relative to clustered index
38 Views Asked by steve8918 At
1
There are 1 best solutions below
Related Questions in SQL
- SQL schema for a fill-in-the-blank exercise
- Hibernate: JOIN inheritance question - why the need for two left joins
- What's supposed to be the problem in this query?
- Compare fields in two tables
- How to change woocomerce or full wordpress currency with value from USD to AUD
- Dynamic query creation with Array like implementation
- SQL query to get student enrolled in this month in a course - Moodle
- SQL LAG() function returning 0 for every row despite available previous rows
- Convert C# DateTime.Ticks to Bigquery DateTime Format
- Use row values from another table to select them as columns and establish relations between them (pivot table)
- SQL: Generate combination table based on source and destination column from same table
- how to use system's environnement variables in sql script
- PHP fetchAll on JOIN
- Multitable joining in Sql
- How to display name starting from 'z' by using BETWEEN cmd only?
Related Questions in DATABASE-PERFORMANCE
- How to calculate hinted_handoff_throttle_in_kb to performance of hints handoff?
- Postgres duplicate indexes
- Custom search query for many-to-many related tables optimization
- Temporary shutdown of autovacuum during rush hour
- This query is slow because I think I'm using a lot of id values in the ANY() clause, isn't it?
- Mongo db update document, with conditional push data
- NodeJS MongoDB Taking Too Long to load and decompress a long list of Images
- Will Upgrading Azure SQL Managed Instance Storage from 512 GB to 1024 GB Improve Performance Based on Premium SSD Size Chart?
- Dynamically create tables by foreign key
- Optimizing Eloquent Queries for Complex Reporting System in Laravel
- How to optimize this postgresql query which is running for more than 24 hours.it process 60,000 data inside the loop
- Vastly different query performance on GIN indexed table using different search words
- In Oracle subquery takes lots of time compared to SQL Server
- Why is it more efficient to create a new table instead of altering a column?
- Single vs Multiple databases in arangodb
Related Questions in CLUSTERED-INDEX
- Does ULID Satisfy MySQL Clustered Index Design for Primary Keys?
- External fragmentation when we do not have clustered index
- Best method for updating SQL Server partition function
- DELETE Command taking a lot of time with clustered index
- impact of writing data out of order relative to clustered index
- What is the difference clustered vs non-clustered index when we calculate number of accesses pages?
- SQL Server not using non clustered index
- Sqlserver how to create composite unique key with one of column is nullable
- BETWEEN SQL - why is there a gap lock?
- when exactly does a lock pseudo-record supremum occur?
- gap locks don't appear in data_locks table - What's happening?
- InnoDB Locking - Does record lock use indexes?
- Clustered index key should be part of the non-clustered index key, but how can I see it?
- Can table have both primary & clustered index together
- Why is this query using the index differently when i change the range of search?
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
You didn't reveal the DB vendor, hardware configuration, the schema, the data volume, nor any observed timing figures.
Zero impact. (Or go ahead, change my mind, publish timing figures.)
No.
B-tree node splits are a thing.
Different technologies describe this in different ways. For example, Oracle refers to the fill factor of an index. Typically more than half of each DB block will be full of user data. The fraction of each block that is cleared to zeros will tend to be smallest when ordered rows were used to populate an empty relation. So the size of an index can vary by 2x based on the initial INSERT pattern. Sometimes query speed is dominated by total time to read all index blocks.
You didn't show us EXPLAIN or ANALYZE details from an example query, which prevents us from predicting what happens in your use case.
It's easy to rebuild an index, or indeed an entire relation, after all rows have become available. Just tacking an ORDER BY onto the end of a giant INSERT query can help. Measure performance before and after such a refresh to identify whether there's a user-visible difference in the elapsed time.