Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transformations: Fix true inner join in joinByField transformation #87409

Merged
merged 30 commits into from
May 9, 2024

Conversation

baldm0mma
Copy link
Contributor

@baldm0mma baldm0mma commented May 6, 2024

What is this feature?

This fixes the joinByField's "inner" join option to perform like an actual SQL-like inner join.

Example:

Students

StudentID Name Major
1 John Computer Science
2 Emily Mathematics
3 Michael Physics
4 Jennifer Chemistry

Enrollments

StudentID CourseID Grade
1 CS101 A
1 CS102 B
2 MATH201 A
3 PHYS101 B
5 HIST101 B

The result after applying the inner join transformation looks like the following:

StudentID Name Major CourseID Grade
1 John Computer Science CS101 A
1 John Computer Science CS102 B
2 Emily Mathematics MATH201 A
3 Michael Physics PHYS101 B

The inner join only includes rows where there is a match between the "StudentID" in both tables. In this case, the result does not include "Jennifer" from the "Students" table because there are no matching enrollments for her in the "Enrollments" table.

Why do we need this feature?

It is used by users, but has not been working as expected. Also, an escalation was created for it.

Who is this feature for?

All users.

Which issue(s) does this PR fix?:

Fixes #87361

Special notes for your reviewer:

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

@baldm0mma baldm0mma self-assigned this May 6, 2024
@grafana-delivery-bot grafana-delivery-bot bot added this to the 11.1.x milestone May 6, 2024
@baldm0mma baldm0mma added type/feature-request area/frontend add to changelog area/transformations no-backport Skip backport of PR area/dataviz Anything that relates to Data Visualisation work but is not specific to one panel labels May 6, 2024
@grafana-pr-automation grafana-pr-automation bot added the type/docs Flags the technical writing team for documentation support; auto adds to org-wide docs project label May 7, 2024
@baldm0mma baldm0mma marked this pull request as ready for review May 7, 2024 18:32
@baldm0mma baldm0mma requested review from a team and imatwawana as code owners May 7, 2024 18:32
@baldm0mma baldm0mma requested review from leeoniya and nmarrs and removed request for a team May 7, 2024 18:32
@baldm0mma baldm0mma changed the title WIP - Transformations: Add true tabular inner join to joinByField Transformations: Add true tabular inner join to joinByField May 7, 2024
@baldm0mma baldm0mma changed the title Transformations: Add true tabular inner join to joinByField Transformations: Fix true inner join to joinByField May 7, 2024
@baldm0mma baldm0mma changed the title Transformations: Fix true inner join to joinByField Transformations: Fix true inner join in joinByField transformation May 7, 2024
@@ -761,93 +761,6 @@ describe('JOIN Transformer', () => {
});
});

it('inner joins by time field in reverse order', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not relevant with the new behavior. seriesToColumns has been deprecated.

Copy link
Contributor

@nmarrs nmarrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall!

Test dashboard
Tabular Inner and Outer Join Test-1715211190626.json

*
* @returns {Array<Array<string | number | null | undefined>>} The joined tables as an array of arrays, where each array represents a row in the joined table.
*/
function joinInner(tables: AlignedData[]): Array<Array<string | number | null | undefined>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function joinInner(tables: AlignedData[]): Array<Array<string | number | null | undefined>> {
function joinInnerTabular(tables: AlignedData[]): Array<Array<string | number | null | undefined>> {

Should we keep naming convention the same as the outer join for tabular data? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update: it seems that for inner we can do both series + tabular data while for some reason outer is differentiated and split out 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I was originally going to add an INNER (TABULAR) option, but the previous INNER wasn't actually behaving like an INNER anyway, so instead of adding yet another option, @leeoniya suggested we just correct the original INNER's behavior, which I think was the right move. The reason we split the 2 OUTER joins - I believe - is because they behave significantly differently then the other, depending on the data (time series or SQL-like). The INNER, however, due to the nature of an INNER join itself, will behave very similarly for both data types - sans a few edge cases.

Copy link
Contributor

@codeincarnate codeincarnate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a spin and looks good! 🚀

@baldm0mma baldm0mma merged commit 9bd44ae into main May 9, 2024
19 checks passed
@baldm0mma baldm0mma deleted the baldm0mma/inner_join_fix branch May 9, 2024 17:01
@kevinwcyu kevinwcyu modified the milestones: 11.1.x, 11.1.0 Jun 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add to changelog area/dataviz Anything that relates to Data Visualisation work but is not specific to one panel area/frontend area/transformations no-backport Skip backport of PR type/docs Flags the technical writing team for documentation support; auto adds to org-wide docs project type/feature-request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Transformations: Outer Join (tabular) is not joining consistent with SQL-style Outer Join
5 participants