I like the virtual table API a lot, but it has some serious drawbacks. You don't need to know much about the execution engine and, indeed, you can't know much, even if that knowledge would help you.

Many parts of the query are not pushed down into the virtual table. For a count, for example, the query your virtual table effectively sees is a plain request for every row. That's great, unless you already know the count and could have reported it directly rather than actually returning every row in the table. You're forced to retrieve and return every row because you have no idea that it was actually just a count.

As another example, if the user query includes a join, you won't see the join. Instead, you will receive a series of N queries for individual IDs, even if you could have more efficiently retrieved them in a batch. If you're writing a virtual table that accesses a remote resource with some latency, any join will absolutely ruin your performance, as you pay a full network roundtrip for each of those N queries. I wrote a module that exposes remote SQL Server/PostgreSQL/MySQL servers as SQLite virtual tables, and joins basically don't work at all if your server is not on your local network. There's nothing I can do about it (other than heuristically guessing what IDs might be coming and requesting them ahead of time) because SQLite doesn't provide enough information to the virtual table layer.

It's my understanding that PostgreSQL's foreign data wrappers (a similar feature to SQLite's virtual tables) push much more information about the query down to the wrapper layer, but I haven't used them myself.

"Comparing SQLite, DuckDB and Arrow with UN trade data" (2021): partial benchmarks of query time and RAM requirements would be [...]

"Introducing Apache Arrow Flight SQL: Accelerating Database Access" (2022):

> Motivation: While standards like JDBC and ODBC have served users well for decades, they fall short for databases and clients which wish to use Apache Arrow or columnar data in general. Row-based APIs like JDBC or PEP 249 require transposing data in this case, and for a database which is itself columnar, this means that data has to be transposed twice, once to present it in rows for the API, and once to get it back into columns for the consumer. Meanwhile, while APIs like ODBC do provide bulk access to result buffers, this data must still be copied into Arrow arrays for use with the broader Arrow ecosystem, as implemented by projects like Turbodbc. Flight SQL aims to get rid of these intermediate steps.

# "The Virtual Table Mechanism Of SQLite":

> - One cannot create a trigger on a virtual table.

Just posted about eBPF a few days ago; opcodes have costs that are or are not costed.

> - One cannot create additional indices on a virtual table. (Virtual tables can have indices but that must be built into the virtual table implementation. Indices cannot be added separately using CREATE INDEX statements.)

How does JOIN performance vary amongst sqlite virtual table implementations? sqlite-parquet-vtable implements shadow tables to memoize row group filters.

> - One cannot run ALTER TABLE ... ADD COLUMN commands against a virtual table.

Are there URIs in the schema? Mustn't there thus be a meta-schema that does, e.g., nested structs with portable types (and jsonschema)? #nbmeta #linkedresearch

sqlite-parquet-vtable reads parquet with arrow for SQLite virtual tables:

    sqlite> CREATE VIRTUAL TABLE demo USING parquet('parquet-generator/99-rows-1.parquet');
    sqlite> SELECT * FROM demo WHERE foo = 123;
    sqlite> SELECT * FROM demo WHERE foo = '123';  -- incurs a severe query plan performance regression without immediate feedback

> The EXPLAIN QUERY PLAN SQL command is used to obtain a high-level description of the strategy or plan that SQLite uses to implement a specific SQL query.
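As a rough illustration of the feedback EXPLAIN QUERY PLAN can give, here is a minimal sketch using Python's built-in `sqlite3` module. It makes several assumptions: it uses an ordinary in-memory table rather than the parquet virtual table (so it does not reproduce the `'123'` regression itself), the `bar` column, the `demo_foo` index and the `query_plan` helper are hypothetical names, and a `CAST` stands in for a predicate the index cannot serve. The exact plan wording varies between SQLite versions, so only the leading keyword is inspected.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row is (id, parent, notused, detail); only the detail text is kept.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the 'demo' table; this is NOT a virtual table.
conn.execute("CREATE TABLE demo (foo INTEGER, bar TEXT)")
conn.execute("CREATE INDEX demo_foo ON demo (foo)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [(i, f"row {i}") for i in range(99)],
)

# Equality on the indexed column: the planner can search the index.
indexed_plan = query_plan(conn, "SELECT * FROM demo WHERE foo = 123")

# Wrapping the column in an expression hides it from the index,
# so the planner has to fall back to scanning the table.
unindexed_plan = query_plan(
    conn, "SELECT * FROM demo WHERE CAST(foo AS TEXT) = '123'"
)

print(indexed_plan)
print(unindexed_plan)
```

Checking the plan this way before running a query against a large or remote-backed table is one way to get the "immediate feedback" that a silently slower query does not provide.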