-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat(dashboards): allow one insight on many dashboards #9331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@clarkus the only UI impact I can see of this is the flow of showing what dashboards an insight is on, navigating to them from the insight, and adding it to more |
Yeah, we'll definitely need a reworked modal to go from a boolean approach (either is on a dashboard or isn't) to a number (anywhere between 0 and n dashboards). |
Added many insights using
and inserted them in one step as a synchronous migration
This took between 18 and 25 seconds over three attempts A million is a lot of insights but this demonstrates that a single insert migration could timeout and cause issues with upgrades |
That's my main worry with splitting |
That's a valid concern. As we don't know what's our customer p100 sort by row count on self-hosted, we should make this migration idempotent and split it into chunks in a loop (e.g. 100 rows/query) that will be retried in case of failures. There's also a (probably marginal) race condition: t0 - app is running |
…es to new dashboard insight relation
I made some progress from the insights side of things. This allows for adding an insight to one or more dashboards, or creating a new dashboard and then adding. I am working through the dashboard side of things next. |
TO-DO
TO FOLLOW-UP_WITH
|
…after connecting dashboards and insights
Q: Is there a link to somewhere I can see where we decided doing this makes sense / solves a good use case / etc. etc. ? This mini-RFC feels more implementation focused, and I'm missing context 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably the least disruptive way to bring about this database change 👍.
This is still a draft, so I'm not sure what is in the TODO list and what is not, nevertheless here and inline are some comments/questions. Happy to address them myself as well.
-
@pauldambra when you did your test here,... did the table you were inserting to have an index or not? Those 18-25 seconds for copying 1M rows from one table to another seem like a lot. Could it be that this was the case, and this insert could be made much faster by cleverly dropping and creating indexes? This feels way too long.
-
We already need an explicit table for the relation, as the fields
layouts
andcolor
should go there, and be removed frominsight
. They can be unique for each dashboards.
# check permissions on any dashboard on legacy dashboard relation or the many-to-many dashboards relation | ||
dashboards: List[Dashboard] = [insight.dashboard] + list(insight.dashboards.all()) | ||
edit_permissions = [d.can_user_edit(cast(User, request.user).id) for d in dashboards if d is not None] | ||
return all(edit_permissions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue you should be able to edit an insight if you can edit any
dashboard the insight is on, not only if you can edit all dashboards the insight is on.
As an example, when someone from the company's board adds an insight onto their private dashboard, you wouldn't expect everyone else to automatically lose edit rights? I'd expect the opposite: the private board dashboards should also show, that this specific insight has been shared more broadly.
from posthog.test.base import BaseTest | ||
|
||
|
||
class TestDashboardInsightModel(BaseTest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the DashboardInsight
itself. Should this be the table between dashboards and insights? If so, is this the right name for such a table, given it'll contain 1) what to show 2) coordinates of the item on the dashboard, 3) color of the item. The "item" is currently just an insight, but it can also be a block of text, the events list table, or whatever. Should we make a new many-to-many model for every new type of dashboard element?
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
superceded by #9416 |
@neilkakkar I missed your comment, sorry There's not much more there but the RFC for the insights work is here https://github.com/PostHog/product-internal/blob/5c4efc9f46416dabd239a5ef931613e035596a18/requests-for-comments/2022-03-14%20nail%20insights.md |
Opening a PR as a mini-RFC
Dashboards Model relations
Currently
At the model level
Insight
s can be saved by themselves or on aDashboard
If they are saved to a dashboard and the dashboard is deleted the insight is deleted too
The relationship is different depending on which side you start at
Insight
can be on zero or oneDashboard
Dashboard
can have zero to manyInsight
sThe foreign key is specified on
Insight
. The relationship specifies a "related_name" soDashboard
has a relationship back toInsight
At the API level
loading a dashboard
https://app.posthog.com/api/projects/:team_id/dashboards/:dashboard_id/?refresh=false
Viewing a dashboard makes an API call for that dashboard by ID
That call returns a structure
loading an insight
https://app.posthog.com/api/projects/:team_id/insights/?short_id=the-short-id
Viewing an insight directly makes an API call by short id
This call returns a paginated structure (ideally (always?) with one result)
The API maps the same relationship:
Desired relationship
The desired relationship is that:
Both insights and dashboards can exist without the other
Django many-to-many relationships implicitly create a join table
Manually creating that join model would allow future changes which could store data unique to that dashboard/item relation
E.g. a different date range applied to the same insight on different dashboards
Complications
The insight model is stored in a posthog_dashboarditem table
Dashboard Item might not be the correct name for the relation. Dashboards may in future hold more than only insights. E.g. Markdown blocks
Leaving Insights on "dashboard item" constrains either future naming, or means a markdown block would be modelled as an insight in code.
To change this we would need a migration to rename the table or to copy the data to a new table
That migration would need to work for cloud and self-host
the insight model already has a one way relationship direct to dashboard
Any migration to a join table would need to migrate the items relationship off Insight and keep any existing relationship
Proposed Approach
Ultimately, the approach is to add a new relation onto
Dashboard
insights = models.ManyToManyField(Insight)
and deprecate the existing
dashboard
field onInsight
This allows querying in both directions. And can have a manual "through" table specified in case future changes need to store data associated with a particular insight/dashboard combo. E.g. different date filters on the "last month" dashboard and "this week" dashboard
(optional) Rename the posthog_dashboarditems table
PostgreSQL documentation on alter table for rename says:
It would need testing, but this suggests we can have a single migration and code change that renames the insight table and changes the table name in the model
TODO This could lead to a DB that has had the table renamed and running instances that are expecting the old table name. That would need to be addressed, adding complication to this work
It is quicker to leave Insights on this table until PostHog 2.0 allows breaking changes and add a TODO to that effect on the model
change the Insight API to present an array of dashboard ids
Currently, the API presents a single dashboard ID. The UI uses this to provide a button to jump to the dashboard. This would need to change to a dropdown button showing the one to many dashboards you could visit from a given insight
The following UI interactions would need to change:
There are 17 places that the UI code knows that dashboard ID is
number | null
and changing tonumber | number[]| null
or adding a newnumber[]
property would need to be accounted forPut a new relation on dashboards to a new DashboardItem model
insights = models.ManyToManyField(Insight)
Initially empty, the dashboard API would read from both
insights
anditems
relations and union and deduplicate the results of this relation and the existingItems
relation3. Migrate existing dashboard-insight relation onto the new model
In a perfect world this could be as simple as looping through
Insight
models, prefetching theirdashboard
, and storing the relation onto the newinsights
relation on that instance ofDashboard
Based on the number of Insights stored on cloud. This would take more than long enough to cause deployments to fail or data to appear to be inconsistent.
The join migration would need to be manually written as a fast set based SQL operationsee commentOr the
Insight
model would union and deduplicate its old and new relation and present that to the outside world. While an async migration paged through the DB updates needed