Chunk download latency #634
Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
@jayantsing-db can you take a look? This PR adds latency logs and merges into sea-migration, since it leverages some refactorings made there.
LGTM. Thanks for making the changes
LGTM
This reverts commit b57c3f3.
* allow empty schema bytes for alignment with SEA Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass is_vl_op to Sea backend ExecuteResponse Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove catalog requirement in get_tables Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move filters.py to SEA utils Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ensure SeaResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * prevent circular imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove unused imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove cast, throw error if not SeaResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass param as TSparkParameterValue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove failing test (temp) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove SeaResultSet type assertion Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * change errors to align with spec, instead of arbitrary ValueError Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make SEA backend methods return SeaResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * use spec-aligned Exceptions in SEA backend Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove defensive row type check Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * raise ProgrammingError for invalid id Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make is_volume_operation strict bool Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove complex types code Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Revert "remove complex types code" This reverts commit 138359d. 
* introduce type conversion for primitive types for JSON + INLINE Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove SEA running on metadata queries (known failures Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary docstrings Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align expected types with databricks sdk Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * link rest api reference to validate types Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove test_catalogs_returns_arrow_table test metadata commands not expected to pass Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix fetchall_arrow and fetchmany_arrow Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove thrift aligned test_cancel_during_execute from SEA tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary changes in example scripts Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary chagnes in example scripts Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * _convert_json_table -> _create_json_table Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove accidentally removed test Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove new unit tests (to be re-added based on new arch) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove changes in sea_result_set functionality (to be re-added) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce more integration tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove SEA tests in parameterized queries Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove partial parameter fix changes Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove 
un-necessary timestamp tests (pass with minor disparity) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * slightly stronger typing of _convert_json_types Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * stronger typing of json utility func s Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * stronger typing of fetch*_json Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove unused helper methods in SqlType Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * line breaks after multi line pydocs, remove excess logs Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * line breaks after multi line pydocs, reduce diff of redundant changes Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff of redundant changes Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * mandate ResultData in SeaResultSet constructor Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove complex type conversion Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * correct fetch*_arrow Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * recover old sea tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move queue and result set into SEA specific dir Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass ssl_options into CloudFetchQueue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant conversion.py Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix type issues Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ValueError not ProgrammingError Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce SEA 
cloudfetch e2e tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * allow empty cloudfetch result Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add unit tests for CloudFetchQueue and SeaResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * skip pyarrow dependent tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * simplify download process: no pre-fetching Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * correct class name in logs Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align with old impl Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align next_n_rows with prev imple Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align remaining_rows with prev impl Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary Optional params Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary changes in thrift field if tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove unused imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * init hybrid * run large queries Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * hybrid disposition Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-ncessary log Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * multi frame decompression of lz4 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ensure no compression (temp) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce separate link fetcher Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * log time to create table 
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add chunk index to table creation time log Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove custom multi-frame decompressor for lz4 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove excess logs * remove redundant tests (temp) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add link to download manager before notifying consumer Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move link fetching immediately before table creation so link expiry is not an issue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * resolve merge artifacts Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant methods Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce callback to handle link expiry Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix types Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix param type in unit tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting + minor type fixes Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Revert "introduce callback to handle link expiry" This reverts commit bd51b1c. 
* remove unused callback (to be introduced later) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * correct param extraction Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove common constructor for databricks client abc Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make SEA Http Client instance a private member Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make GetChunksResponse model more robust Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add link to doc of GetChunk response model Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass result_data instead of "initial links" into SeaCloudFetchQueue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move download_manager init into parent CloudFetchQueue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * raise ServerOperationError for no 0th chunk Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * unused iports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * return None in case of empty respose Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ensure table is empty on no initial link s Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * account for total chunk count Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * iterate by chunk index instead of link Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make LinkFetcher convert link static Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add helper for link addition, check for edge case to prevent inf wait Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add unit tests for LinkFetcher Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary download manager check Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove 
un-necessary string literals around param type Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove duplicate download_manager init Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * account for empty response in LinkFetcher init Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make get_chunk_link return mandatory ExternalLink Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * set shutdown_event instead of breaking on completion so get_chunk_link is informed Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * docstrings, logging, pydoc Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * use total_chunk_cound > 0 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * clarify that link has already been submitted on getting row_offset Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * return None for out of range Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * default link_fetcher to None Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> --------- Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Chunk download latency (#634) * chunk download latency Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * formatting Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * test fixes Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * sea-migration static type checking fixes Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * check types fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fix type issues Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * type fix revert Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * - Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * statement id in get metadata functions Signed-off-by: Sai Shree Pradhan 
<saishree.pradhan@databricks.com> * removed result set extractor Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * databricks client type Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * formatting Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * remove defaults, fix chunk id Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * added statement type to command id Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * check types fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * renamed chunk_id to num_downloaded_chunks Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * set statement type to query for chunk download Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * comment fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * removed dup check for trowset Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> --------- Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * acquire lock before notif + formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add get_chunk_link s Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * simplify description extraction Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass session_id_hex to ThriftResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * revert to main's extract description Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * validate row count for sync query tests as well Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * guid_hex -> hex_guid Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: 
varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * set .value in compression Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant test Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move extra_params to the back Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * is_direct_results -> has_more_rows Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Revert "is_direct_results -> has_more_rows" This reverts commit 0e87374. * stop passing session_id_hex Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant comment Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add extra_params param Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass extra_params into test_...unset... Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove excess session_id_he Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce changes in DatabricksRetryPolicy Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff in DatabricksRetryPolicy Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * simple comments on proxy setting Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * link docs for getproxies)( Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * rename proxy specific attrs with proxy prefix Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> --------- Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
* remove redundant conversion.py Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix type issues Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ValueError not ProgrammingError Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce SEA cloudfetch e2e tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * allow empty cloudfetch result Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add unit tests for CloudFetchQueue and SeaResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * skip pyarrow dependent tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * simplify download process: no pre-fetching Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * correct class name in logs Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align with old impl Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align next_n_rows with prev imple Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align remaining_rows with prev impl Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary Optional params Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary changes in thrift field if tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove unused imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * init hybrid * run large queries Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * hybrid disposition Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-ncessary log Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant tests 
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * multi frame decompression of lz4 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ensure no compression (temp) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce separate link fetcher Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * log time to create table Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add chunk index to table creation time log Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove custom multi-frame decompressor for lz4 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove excess logs * remove redundant tests (temp) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add link to download manager before notifying consumer Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move link fetching immediately before table creation so link expiry is not an issue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * resolve merge artifacts Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant methods Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * introduce callback to handle link expiry Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix types Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix param type in unit tests Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * formatting + minor type fixes Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Revert "introduce callback to handle link expiry" This reverts commit bd51b1c. 
* remove unused callback (to be introduced later) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * correct param extraction Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove common constructor for databricks client abc Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make SEA Http Client instance a private member Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make GetChunksResponse model more robust Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add link to doc of GetChunk response model Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass result_data instead of "initial links" into SeaCloudFetchQueue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * move download_manager init into parent CloudFetchQueue Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * raise ServerOperationError for no 0th chunk Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * unused iports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * return None in case of empty respose Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * ensure table is empty on no initial link s Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * account for total chunk count Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * iterate by chunk index instead of link Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make LinkFetcher convert link static Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add helper for link addition, check for edge case to prevent inf wait Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add unit tests for LinkFetcher Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary download manager check Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove 
un-necessary string literals around param type Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove duplicate download_manager init Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * account for empty response in LinkFetcher init Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * make get_chunk_link return mandatory ExternalLink Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * set shutdown_event instead of breaking on completion so get_chunk_link is informed Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * docstrings, logging, pydoc Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * use total_chunk_cound > 0 Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * clarify that link has already been submitted on getting row_offset Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * return None for out of range Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * default link_fetcher to None Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> --------- Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * Chunk download latency (#634) * chunk download latency Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * formatting Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * test fixes Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * sea-migration static type checking fixes Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * check types fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fix type issues Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * type fix revert Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * - Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * statement id in get metadata functions Signed-off-by: Sai Shree Pradhan 
<saishree.pradhan@databricks.com> * removed result set extractor Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * databricks client type Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * formatting Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * remove defaults, fix chunk id Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * added statement type to command id Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * check types fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * renamed chunk_id to num_downloaded_chunks Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * set statement type to query for chunk download Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * comment fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * removed dup check for trowset Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> --------- Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * acquire lock before notif + formatting (black) Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix imports Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * add get_chunk_link s Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * simplify description extraction Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * pass session_id_hex to ThriftResultSet Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * revert to main's extract description Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * validate row count for sync query tests as well Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * guid_hex -> hex_guid Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: 
varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * set .value in compression Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * reduce diff Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * is_direct_results -> has_more_rows Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * preliminary large metadata results Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * account for empty table in arrow table filter Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align flows Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * align flow of json with arrow Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * case sensitive support for arrow table Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove un-necessary comment Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * fix merge artifacts Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove redundant method Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove incorrect docstring Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> * remove deepcopy Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com> --------- Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
What type of PR is this?
Description
Record chunk download latency.
Added chunk_id to SqlExecutionEvent and ResultDownloadHandler.
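As a rough illustration of the idea (not the code in this PR), the sketch below times a single chunk download and tags the measurement with its chunk index. The names ChunkLatencyRecord, timed_download, and the in-memory records list are hypothetical stand-ins; the connector's actual SqlExecutionEvent and ResultDownloadHandler are not reproduced here.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ChunkLatencyRecord:
    """Hypothetical record; not the connector's actual telemetry event."""

    statement_id: str
    chunk_id: int
    latency_ms: int


def timed_download(
    statement_id: str,
    chunk_id: int,
    download: Callable[[], bytes],
    sink: List[ChunkLatencyRecord],
) -> bytes:
    """Run `download`, then append one latency record for this chunk to `sink`."""
    start = time.perf_counter()
    data = download()
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    sink.append(ChunkLatencyRecord(statement_id, chunk_id, elapsed_ms))
    return data


# Simulate four chunk downloads; the lambda stands in for a real HTTP fetch.
records: List[ChunkLatencyRecord] = []
for chunk_id in range(4):
    timed_download("stmt-1", chunk_id, lambda: b"x" * 1024, records)
```

Carrying the chunk index alongside the statement ID lets each latency measurement be attributed to a specific chunk of a specific statement, which is what makes per-chunk logs such as the one described under testing possible.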
How is this tested?
Ran the query:
SELECT * FROM RANGE(20000000)
Latency was recorded for 4 chunks.
The latency log for the 1st chunk:
Related Tickets & Documents
PECOBLR-653