-
Notifications
You must be signed in to change notification settings - Fork 6
Open
Description
HogQL / Data Exploration TODO
These are the next steps to work on. Scroll below for old completed points.
Insight conversion
HogQL
- Support dashboard filters HogQL insight works with Dashboard filters posthog#15999
- HogVM action matching. It's done. feat(hogql): bytecode, local evaluation and action matching in plugin server posthog#16189
- Build a full type system, type all functions, fix a lot of null issues.
- Improve JSON support. Treat every string as a json field if accessed with a property. Extract array if passed string where array is expected.
- Strings in intervals,
interval '5 day'
- Date Units https://clickhouse.com/docs/en/sql-reference/functions/date-time-functions
- Make a decision on
int
&float
properties (all "number" properties are presented the JSONnumber
aka the SQLfloat
now). Do we want "int" properties? - Make a decision: move
team_id
guard into PREWHERE, or keep in WHERE - Add all
OrDefault
aggregations - Support empty strings
''
and null strings'null'
in materialized properties. - Connect the read only database user for HogQL queries
- Do we want to keep extracting macros from the WITH clause, or embrace CTE-s?
- Some other ClickHouse SQL grammar we need to support?
- Double check we can parse everything in the antlr grammar.
- Fuzzing testing - randomly generated allowed queries shouldn't cause exceptions and failures
- Recording and session filters
- Automatic benchmarks
- Query tables for other projects within an organization
- Performance boost: Transform table selects into subqueries Always filter by
team_id
early in HogQL joins posthog#15966 - Performance boost: Joined subquery AND optimisation (move
persons
AND
filters intoraw_persons
subquery) - Performance boost: HogQL: Filter groups through subquery instead of join posthog#21320
- Performance boost: HogQL optimization that filters events early on posthog#21269
- Work out pricing for API usage
HogQL feature completeness
- HogQL: Support for CTEs in a union query posthog#20663
- HogQL: Support for
JOIN ... USING ...
posthog#20660 - HogQL: Support for
JOIN ... AS ...
posthog#20661 - HogQL: Support for distinct aggregate functions like
countDistinctIf
posthog#20662 - HogQL: Support list placeholders and nullable placeholders posthog#20667
- HogQL functions should ignore casing
COALESCE
vscoalesce
posthog#20668 - HogQL aggregations should ignore casing
SUM
vssum
posthog#21469
Event explorer UX
- Why do we remove the http:// from the domains? (slack thread)
- Clicking on cells should filter by value (normal cell), or drill down further (aggregation number cell).
- "Load more" should also expand the time range when you're at the end of "last 24h".
- Finish the lemon data grid
- Implement UX feedback from feat(data-exploration): improved editing UX posthog#13726 (review)
- Sorting by "event" with autocapture is odd (table says "clicked on" not "$autocapture bla bla", so sort order makes it not clear)
Pivot tables
- Create a separate EventsPivotNode,
source
either HogQLQuery or EventsQuery - Add buttons to the events table and BI UI to make a pivot table
Insights, Dashboards, Frontend
- Make
<Query query={} />
something any other React app can import and use - Make the Query tag work within sidebar apps too.
- Add throttling, caching and timeouts to
dataNodeLogic
, especially for dashboards.
Consolidated 2023-02-02
The lists above were getting long, so moved (2nd of Feb 2023) all the completed points below:
Events table
- Add person and cohort tables @mariusandra
- Get basic HogQL in @mariusandra
-
Aggregated node should nest the events nodeMake new EventsQuery node - Rework the Column Configurator to support formulas and aggregations while looking slick-ish.
- Fix data table regressions @mariusandra
- Reported bug on slack
- Add pagination
- Restore date separator rows (and category rows if sorting by something else?)
- What to do with
coalesce(url, screen)
and filters?
- Support HogQL filters in the interface (I mean editing the query node's
where
part in the UI, and not exactly this loosely related PR) - Add a date range selector on top of the table (defaults to 3 days if no aggregations, and 30 if with aggregations)
- How to avoid too much load on our cluster? Users will be able to query for almost anything after all (all time sort by properties.bla).
- Replace column header sort icons with dropdown menus, add filtering and sorting there.
- Get query nodes into backend. Parse data from there.
- Yank HogQL out of property filters and into
where
-
Bring back saved columnsNot happening, will save as insights instead. - Move events list query to posthog/queries/event_list.py and refactor.
- Create a
property_filter_to_hogql
abstraction
- Create a
- SQL errors into the frontend feat(data-exploration): errors in data table posthog#13703 (done in the hogql unleashed PR)
HogQL
- Get basic HogQL expressions working for "select" and grouping
- Get HogQL filtering working (including "having" and comparisons)
- Figure out which clickhouse functions we want to enable (e.g.
startOfDay
), and which not (evalMLMethod
). - Use HogQL expressions in property filters as a separate "sql expression" filter type
- Create an interface to enter a HogQL expression in the app's UI
- Support HogQL filters in all the filters and breakdowns
- Add missing aggregations (count distinct property)
- Fix problem with
%
breaking things due to python'sstring.format
(moved vars to vars, but someone could still write a symbol thatf""
interprets down the line. Make sure we combine the query in a clear way) - Add support for tracking person property types. @macobo
- Create a technical/language document for HogQL, showing what is allowed and what isn't (we have the grammar)
Backend
- Create a
/query
endpoint, and add a router on the backend. At first support just the newEventsQuery
Pivot tables
- Map out the problem space on the query level
- Add the relevant parameters to the data node queries and create a MVP
Insights, Dashboards, Frontend
- Port over insights and their filters to use data exploration natively @thmsobrmlr (done, just cleaning up) Data Exploration Insights #84
- Port over dashboards to store data exploration queries @pauldambra
- Get rid of antd @Twixes (done enough)
Consolidated 2023-06-06
This list was pasted on the 6th of June, 2023 after launching the HogQL public beta.
HogQL
- Python-based HogQL feat(data-exploration): formulas in tables, custom columns, aggregations posthog#13203
- HogQL filters in existing insights feat(data-exploration): HogQL property filters posthog#13264
- Create an ANTLR based HogQL parser feat(hogql): antlr grammar posthog#13971
- Merged
- Templating/fragment system for query building feat(hogql): placeholders and cleanup posthog#14116
- And a bunch of other cleanups.
- Merged
- SELECT statements feat(data-exploration): direct SQL querying posthog#14042
- Support most SELECT statements feat(hogql): select statements posthog#14131
- Add automatic
where
for team_id when making queries in ClickHouse - Support name resolution, aliasing (e.g.
select e.bla from events e
), conflict avoidance feat(hogql): symbol resolution posthog#14185 - Support
*
expansion on tables feat(hogql): asterisk expander posthog#14271 - Intervals feat(hogql): intervals posthog#14370
- Virtual database schema that could be customised per team (if we choose to implement this)
- Merged
-
HogQLQuery
node in/query
feat(hogql): HogQLQuery node and data table support posthog#14265- Pretty print
- Flagged
- Merged
- Person properties join
- As AST mangler for HogQL select statements
- Join person_distinct_id automatically feat(hogql): automatic person table joins posthog#14286
- Join persons automatically
- Still works with all the old stuff
- Clean code
- Merged
- Events table built on top of HogQL feat(hogql): Events table based on hogql posthog#14315
- Rename symbols to refs chore(hogql): rename symbols to refs posthog#14410
- Property filters to HogQL
- Action steps to HogQL
- Merged
- Finally launch HogQL expressions
- Basic documentation Document HogQL expressions posthog.com#5416
- Remove autofocus fix(hogql-inline): remove autofocus posthog#14489
- Release flag
- Announce to users
- Remove flag chore(hogql): remove flag from hogql expressions posthog#14616
- Launch events table via HogQL
- Fetch properties directly between shards feat(hogql): properties between shards posthog#14549
- Cohort filters feat(hogql): cohort filter posthog#14600
- Improve 5sec "load new results" interval fix(events-explorer): events explorer improvements posthog#14648
- Make it faster fix(events-explorer): events explorer improvements posthog#14648
- Launch to beta users
- Query cancellation at the
dataNodeLogic
level feat(data-exploration): cancel hogql & events queries posthog#14662
- Odds and ends
- UNION ALL feat(hogql): support UNION ALL posthog#14593
- Database schema (and PoE setting) per team feat(hogql): custom virtual database per team posthog#14740
- Clean up the confusion with our bespoke
countDistinct()
function feat(hogql): count distinct and count star posthog#14786 - Database schema management view feat(hogql): database schema page posthog#14591
- Automatic type conversion (e.g.
toInt
) for event and person properties, per team feat(hogql): automatic event and person property types posthog#14795 - WITH and macros/expressions feat(hogql): WITH support & macros posthog#14495
- #107
- HogQL/SQL insight feat: hogql editor pane posthog#14688
- HogQL on dashboards
- Rename
/query
to/debug
chore(hogql): rename /query to /debug posthog#14729 - Open as new insight feat: replace open in query builder with new insight page posthog#14709
- Always add table names in front of all columns? feat(hogql): always prepend table name in clickhouse sql posthog#14837
- Resource usage safeguards feat(hogql): add readonly=1 and max_execution_time=60 on every query posthog#14870 feat(query): rate limit to 120 requests/min posthog#14873
- Launch sql insights to beta users. chore(hogql): consolidate all parts of HogQL under one feature flag (and enable HogQL beta) posthog#14871
- More odds and ends
- Add timezone support (pass zone to all toDateTime calls) feat(hogql): timezone support posthog#14968
- JSON property access feat(hogql): JSON property access posthog#14976
- Expose more ClickHouse functions feat(hogql): more functions posthog#14969
- Really lazy tables feat(hogql): really lazy tables posthog#14927
- Arrays feat(hogql): arrays and tuples posthog#14986
- Lambdas feat(hogql): lambdas posthog#14987
- Get HogQL errors on insights out of sentry and into the interface feat(hogql): better errors posthog#15001
- Beta development final stretch
- Rename "ref"-s to "type"-s feat(hogql): rename "ref"-s to "type"-s, introduce types posthog#15106
- Type classes feat(hogql): type class refactor, field to constant type resolving posthog#15129
- Fix: rename toTimezone fix(hogql): rename toTimezone to toTimeZone posthog#15127
- Fix: nested macros fix(hogql): nested macros posthog#15126
- Tuple access feat(hogql): tuple access posthog#15130
- More aggregations feat(hogql): more aggregations posthog#15131
- Window functions feat(hogql): window functions posthog#15132
- Merge resolver, macros, asterisks feat(hogql): refactor macros and asterisks into "resolver" posthog#15175
- Support PoEv2 overrides table feat(hogql): add poe v2 support posthog#15120
- Null handling feat(hogql): handle null properties posthog#15244
- Group filters part 1 feat(hogql): groups support - part 1 posthog#15231
- Trends HogQL volume feat(insights): trends hogql volume posthog#15672
- ... and many other PRs
- LAUNCH PUBLIC BETA
- Bring back event explorer saved columns feat(event-explorer): bring back project-specific columns posthog#15899
- Launch sql insights to all users.
- Launch events table to all users
Events table
- Convert the events table to use HogQL internally (partially done)
- Get the events table back up and running for beta users
- Show reload button feat(event-explorer): always show the reload button posthog#14785
Pivot tables
- Implement Pivot tables MVP to prove it's doable
- Implement pivot tables through HogQL
Consolidated 2023-12-07
HogQL
- Post-beta shortlist
- Make errors more helpful
- Async Inline SQL expression validation feat(hogql): inline error validation posthog#15881
- Async full SQL select query validation feat(hogql): errors for large full select queries posthog#15957
- Figure out which ClickHouse SQL errors we can expose to users feat(3000): Expose safe CH errors in HogQL query endpoint posthog#15959
- Figure out which ClickHouse SQL errors (uncaught in the HogQL layer) happen most often. --> Will add as needed.
- Do we need to build a full type system to catch some things before they reach ClickHouse? --> no, but would be cool
- Write some docs for the
/query
API endpoint. Ideally autogenerated. HogQL API docs (the most basic version) posthog.com#6096 - String null handling for concat feat(hogql): null-tolerant concat posthog#15975
- Make a decision on properties and NULLs (slack thread, older context) - going to use as many nulls as possible
- Null handling for
==
and!=
fix(hogql): null handling with == and != comparisons, attempt 2 posthog#16259 - Nullish Coalesching feat(hogql): nullish coalescing posthog#16276
- Remove feature flag feat(hogql): remove feature flag posthog#15936
- System of modifiers to swap performance optimisations in/out
- Make errors more helpful
- Hotfixes requested by customers
- Timestamp in a subquery (slack thread) fix(hogql): proper datetime aliases for select columns posthog#15970
- Fix empty list filter fix(hogql): empty properties array in filter posthog#15971
- Array property access feat(hogql): property array access posthog#15965
-
tumble
doesn't work fix(hoql): convert datetimes for clickhouse tumble function posthog#16000 - Selecting from the S3 table feat(hogql): select from s3 posthog#16038
- Selecting count() from a lazy table feat(hogql): add session_replay_events table posthog#16061
- Add support for cohort matching from within the query feat(hogql): person_id in cohort posthog#16119
-
toDate(timestamp)
does not work. What's the alternative? fix(hogql): Properly compiletoDate
based on type info posthog#16757 - Selecting from the numbers table feat(hogql): numbers() table posthog#16945
- Add WITH FILL WITH FILL for HogQL posthog#17002
- HogVM action matching feat(hogql): bytecode, local evaluation and action matching in plugin server posthog#16189
- Misc cleanup feat(hogql): bunch of improvements (HogVM part 1) posthog#16274
- Python & JS VM feat(hogql): local evaluation (HogVM part 2) posthog#16275
- Cohort matching feat(hogvm): async operations for in cohort support (HogVM part 3) posthog#16279
- Get it live
- PoEv2 part 2
- Custom SQL columns feat(hogql): custom SQL columns posthog#15180
- Wrap up this refactor: feat(hogql): working branch / resolver refactor posthog#15194
- Swap out person_id with an
if
when PoEv2 enabled
- Documentation improvements
- Link to the ClickHouse function equivalent in the list of HogQL functions.
- Other developments
- Graphs directly from HogQL data (e.g. events table + aggregation -> graph)
- Charts from HogQL queries if possible
- Benchmark query parsing and printing (parsing very large blobs of SQL text seemed to be slow)
- Group filters with joins and aliases (
where events.organization.name = 'PostHog'
) feat: Added lazy joins for groups on events posthog#17950 - Fix
toDate(now())
(we aliastoDate
totoDateOrNull
, which only takes a String argument, not DateTime) fix(hogql): Properly compiletoDate
based on type info posthog#16757 - Array joins
- Add time range selector (important for dashboard-level time range, ZEN-3438)
- Clean up the
/debug
view (add sidebar, better examples, etc) - Support column aliases in event explorer feat(event-explorer): column aliases posthog#15861
- Re-enable event explorer "turbo mode"
-
select session.duration
Data Exploration / Interlinking
- Support AST reuse with
EventsQuery().to_ast()
orHogQLQuery().to_ast()
- Support persons modal with
{ kind: PersonsQuery, source: { kind: InsightQuery, day: 3, breakdown: 'fish' } }
- Backend support for custom SQL expressions feat(hogql): Expression Fields & PoEv2 posthog#18811
- Subqueries you can save as views you can select from (not released?)
AdelDima and abegehr
Metadata
Metadata
Assignees
Labels
No labels