shakti

aboutuniversal database + language for: hedgefunds banks manufacturers iot formula1 .. why: hundred times faster than biqquery redshift azure databricks snowflake .. who: arthur whitney+ thanks to e.l. whitney[1920-1966] multiple putnam winner k.e. iverson[1920-2004] APL turing award'79 [advisor] john cocke [1925-2002] RISC turing award'87 [advisor] compare vs biggie shifty azure sparky flaky .. taxi: 1.1Billion 10-100 times faster/QP$ taq: 1.1Trillion 100-INF times faster/QP$benchmarkaboutk-sql is consistently 100 times faster (or more) than redshift, bigquery, snowflake, spark, mongodb, postgres, .. same data. same queries. same hardware. anyone can run the scripts. benchmarks: taq 1.1Trillion trades and quotes taxi 1.1Billion nyc taxi rides stac .. Taq 1.1T https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v2.2a.pdf q1:select max price by sym,ex from trade where sym in S q2:select sum size by sym,time.hour from trade where sym in S q3:do(100)select last bid by sym from quote where sym in S / point select q4:select from trade[s],quote[s] where price<bid / asof join S is top 100 (10%) time(ms) 16core 100 days q1 q2 q3 q4 k 44 72 63 20 spark 80000 70000 DNF DNF - can't do it postgres 20000 80000 DNF DNF - can't do it .. Taxi 1.1B https://tech.marksblogg.com/benchmarks.html q1:select count by type from trips q2:select avg amount by pcount from trips q3:select count by year,pcount from trips q4:select count by year,pcount,_ distance from trips cpu cost core/ram elapsed machines k 4 .0004 4/16 1 1*i3.2xlarge(8v/32/$.62+$.93) redshift 864 .0900 108/1464 8(1 2 2 3) 6*ds2.8xlarge(36v/244/$6.80) bigquery 1600 .3200 200/3200 8(2 2 1 3) db/spark 1260 .0900 42/336 30(2 4 4 20) 21*m5.xlarge(4v/16/$.20+$.30) Stac https://www.stacresearch.com/ ..documentk.dverb noun adverb + flip bool f' each - char " ab" f/ right over * first name ``ab f\ left scan % int 2 3 4 & where float 2 3e4 \v vars | reverse *fixed 2.00 3.40 \w work < asc \l import > desc *z.d date 2001.01.01 \t timeit = freq *z.t time 12:34:56.789 ~ not *z.l locus 40.75::73.98 ! range , list I/O @ type list (2;3.4) 0' line ? find unique dict {a:2 3} 1' char ^ cut sort table [a:2 3] *2' data # take count *3' set _ drop floor expr :2+x *4' get $ cast str *func {[.]..} *5' ffi if a then b else c *enterprise python import k;k.k('select ..') select [count first last min max sum ..]v by e from t sqrt exp log sin cos; in div mod bar *ffi: csv json kafka parquet avro .. `csv?`csv t:[t:09:30:00 09:30:00;e:"bb";s:`aa`aa;v:2 3;p:2.3 3.4] `lz4?`lz4 x:"t,e,s,v,p\n09:30:00,b,aa,2,2.3\n09:30:00,b,aa,3,3.4\n"sql.dshakti universal database includes: ansi-sql [1992..2011] ok for row/col select. real-sql [1974..2021] atw@ipsa does it better. join: real-easy ansi-ok real: select from T,U ansi: select from T left outer join U group: real-easy ansi-annoy real: select A by B from T ansi: select B, A from T group by B order by B simple: real-easy ansi-easy real: select A from T where C or D, E ansi: select A from T where (C or D)and E complex: real-easy ansi-awful asof/joins select from t,q where price<bid first/last select last bid from quote where sym=`A deltas/sums select from t where 0<deltas price foreignkeys select order.cust.nation.region .. arithmetic x+y e.g. combine markets through time example: TPC-H National Market Share Query 8 http://www.qdpma.com/tpch/TPCH100_Query_plans.html what market share does supplier.nation BRAZIL have by order.year for order.customer.nation.region AMERICA and part.type STEEL? real: select revenue avg supplier.nation=`BRAZIL by order.year from t where order.customer.nation.region=`AMERICA, part.type=`STEEL ansi: select o_year,sum(case when nation = 'BRAZIL' then revenue else 0 end) / sum(revenue) as mkt_share from ( select extract(year from o_orderdate) as o_year, revenue, n2.n_name as nation from t,part,supplier,orders,customer,nation n1,nation n2,region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'AMERICA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'STEEL') as all_nations group by o_year order by o_year; Comparison: real ansi(sqlserver/oracle/db2/sap/teradata/..) install 1 second 100,000 second hardware 1 milliwatt 100,000 milliwatt software 160 kilobyte 8,000,000 kilobyte (+ 10,000,000kilobyte O/S) mediandb 1,000,000 megarow 10 megarow https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver15 shakti is essential for analyzing big (trillion row+) and/or complex data.linuxffipythonmacos