Skip to content

FULL OUTER JOIN support #4526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from
Draft

FULL OUTER JOIN support #4526

wants to merge 9 commits into from

Conversation

oeed
Copy link
Contributor

@oeed oeed commented Mar 7, 2025

This PR adds support for FULL OUTER JOIN queries, which are supported by both Postgres and SQLite (but not MySQL). I'm unaware if there are a specific reason it wasn't originally included, perhaps just as they're not super common, but I've run it to a situation where I do so figured it was worthwhile PRing.

This isn't quite ready, but it has the general structure. It primarily just needs tests and checking that it's done in a good way. Have left a few comments with questions about whether its taking the best approach.

@oeed oeed changed the title FULL OUTER JOIN FULL OUTER JOIN support Mar 7, 2025
Copy link
Member

@weiznich weiznich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. This is great to see ❤️.

I think that's already a solid start implementation wise. I would like to tweak the backend restrictions a bit to make this extendable by third party backends + restrict it really only to postgres + sqlite. Otherwise the implementation is already ready in my opinion.

What is still missing are tests + documentation. Also the compile test coverage could be improved. For tests it would be great to extend the existing join tests here to have at least:

  • A test for a plain full join
  • A test for using several full joins in a row
  • Mixing full joins with left join and inner join

@oeed
Copy link
Contributor Author

oeed commented Mar 7, 2025

Great, note re docs and tests.

That's a good point about 3rd party backends. How would you recommend doing so, a trait implemented on the backend? Are there any existing examples of that?

Sorry was on mobile and didn't realised you'd replied to each note.

@weiznich
Copy link
Member

weiznich commented Mar 7, 2025

That's a good point about 3rd party backends. How would you recommend doing so, a trait implemented on the backend? Are there any existing examples of that?

See the instructions here: #4526 (comment)

@oeed oeed force-pushed the full-outer-join branch from 3c47828 to abd9f11 Compare March 9, 2025 21:11
self.left
.source
.default_selection()
.nullable()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually flawed, and I can't seem to find a good solution. Because .nullable() always wraps the selection in Nullable, even if already Nullable, when using multiple full_join() calls you end up with (Nullable<(Nullable<A>, Nullable<B>)>, Nullable<C>), rather than (Nullable<A>, Nullable<B>, Nullable<C>).

One solution could be to make a IntoNullable trait on expressions (akin to the sql_type version), which would essentially return Self if already Nullable<...>. However, I've been getting conflicting impl errors between the T and Nullable<T> blocks and can't seem to find a way to get this to work. Any ideas or other approaches that could be taken? Seemingly the fact it's checking IsNull with the indirection from Expression::SqlType is preventing Rust from realising they're disjoint as is done in the sql_type version. (still fails if adding + SingleValue)

/// Convert values into their nullable form, without double wrapping them in `Nullable<...>`.
pub trait IntoNullable<ST> {
    /// The nullable value of this type.
    ///
    /// For all values except `Nullable`, this will be `Nullable<Self>`.
    type Nullable;

    /// Convert this value into its nullable representation.
    ///
    /// For `Nullable<T>`, this remain as `Nullable<T>`, otherwise the value will be wrapped and be `Nullable<Self>`.
    fn into_nullable(self) -> Self::Nullable;
}

impl<T> IntoNullable for T
where
    T: Expression,
    T::SqlType: SqlType<IsNull = sql_types::is_nullable::NotNull>,
{
    type Nullable = Nullable<T>;

    fn into_nullable(self) -> Self::Nullable {
        Nullable(self)
    }
}

impl<T> IntoNullable for Nullable<T>
where
    Nullable<T>: Expression,
    <Nullable<T> as Expression>::SqlType: SqlType, // also fails with SqlType<IsNull = sql_types::is_nullable::IsNullable>
{
    type Nullable = Self;

    fn into_nullable(self) -> Self::Nullable {
        self
    }
}

Then replace Nullable<Left::DefaultSelection> with <Left::DefaultSelection as IntoNullable>::Nullable.

Alternatively, we could add a AppendSelection impl
for Nullable<(Nullable<Left>, Nullable<Mid>)> which unpacks it, but you have the same type issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what could work is using the associated type specialization trick used in other places.

That means we would need to restructure that trait like follows:

// that shouldn't be part of the public exposed API, i.e. it shouldn't be nameable by 
// users outside of diesel
// Also: Prefer going with adding the `Expression` prefix everywhere 
// as it will otherwise be confused with the sql type trait
// (That already happens from time to time for the types, no 
// need to introduce another case here.)
pub trait IntoNullableExpression<E> {
    type NullableExpression;
    fn into_nullable_expression(e: E) -> Self:::NullableExpression;
}

impl<ST, E> IntoNullableExpression<E> for ST 
where 
    E: Expression<SqlType = ST>,
    ST: SqlType,
    // forward it to the other two impls
    (ST::IsNull, ST): IntoNullableExpression<E>,
{
    type NullableExpression = <(ST::IsNull, ST) as IntoNullableExpression<E>>::NullableExpression;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        <(ST::IsNull, ST) as IntoNullableExpression<E>>::into_nullable_expression(e)
    }
}

// these impls cannot overlap with each other as the Self type is clearly
// different. They also shouldn't overlap with the generic one
// above as `SqlType` is only implemented for tuples if all elements
// implement `SqlType` on their own, which is not the case for `is_nullable::NotNull` 
// and `is_nullable::IsNullable`
impl<ST, E> IntoNullableExpression<E> for (is_nullable::NotNull, ST)
where
    E: Expression<SqlType = ST>,
    ST: SqlType<IsNull = is_nullable::NotNull>,
{
    type NullableExpression = Nullable<E>;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        Nullable(self)
    }
}

impl<ST, E> IntoNullableExpression<E> for (is_nullable::IsNullable, ST)
where
    E: Expression<SqlType = ST>,
    ST: SqlType<IsNull = is_nullable::IsNullable>,
{
    type NullableExpression = E;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        e
    }
}

in that way you have to two impl that are clearly not overlapping as they are for different types and a third impl on top of it that forwards to the other impls. You then need to call it slightly different as <E::SqlType as IntoNullableExpression<E>>::into_nullable_expression(e) from the calling side, but given that this is only one location + that's not public API I would consider that acceptable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh clever! Will try update to use that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs in to an issue unfortunately, as tuples seem to be IsNullable if any of the values are IsNullable. e.g. (Integer, Text, Nullable<Text>)::IsNull == IsNullable. Thus, it keeps (Integer, Text, Nullable<Text>) as (Integer, Text, Nullable<Text>), rather than turning it in to Nullable<(Integer, Text, Nullable<Text>)>.

Tuples are using OneIsNullable, should they perhaps instead be using AllAreNullable? Or if that's correct, is there another way to do this check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that's correct for tuples. Giving it another thought I noticed that this all is not required. (Nullable<(Nullable<A>, Nullable<B>)>, Nullable<C>) is as far as I remember the expected result here. The sql type code handles unwrapping that to something more meaningful if I remember that correctly. In fact that seems to be also the thing we do for LEFT JOINS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I follow sorry, I was under the impression that AppendSelection should instead give (Nullable<A>, Nullable<B>, Nullable<C>)? In its doc comment it says:

Used to ensure the sql type of left.join(mid).join(right) is (Left, Mid, Right) and not ((Left, Mid), Right).

The following compiles correctly using left joins:

    let source: Vec<(User, Option<Post>, Option<Comment>, Option<Like>)> = users::table
        .left_join(posts::table.on(posts::user_id.eq(users::id)))
        .left_join(comments::table.on(posts::id.eq(comments::post_id)))
        .left_join(likes::table.on(comments::id.eq(likes::comment_id)))
        .load(connection)
        .unwrap();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means I need to have another look at this to give better suggestions. I'm not sure if I find the time this or next week. If that doesn't happen until then please ping me again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed c718389 which at least fixes compiling all the tests. I'm still not 100% sure if this impl is correct. It might require an additional step that wraps every element in the tuple returned from the Join query source (left hand side) in a Nullable wrapper if it's not already nullable. I think this would apply if you have first a inner join and then a full join. If that is still problematic it should be possible to implement such a tuple mapping by having a helper trait for all supported tuple sizes that just applies the IntoNullableExpression trait to all tuple elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that, I added a test that did confirm that inner join then full join is problematic, so I've added IntoNullableExpression for tuples.

It is now essentially working, however, I've needed to add IntoNullableExpression for SkipSelectableExpressionBoundCheckWrapper as impl<ST, E> IntoNullableExpression<E> for ST needs to be where ST: SingleValue to avoid conflicting with the tuple impls and SkipSelectableExpressionBoundCheckWrapper no longer has an impl. That's conflicting with impl<ST, E> IntoNullableExpression<E> for ST though. Looking forward to negative trait bounds stabilising...

I don't have more time right now to try and fix, but will come back in a few more days when I get the time. If you have thoughts on it let me know though.

@oeed oeed requested a review from weiznich March 11, 2025 03:42
Comment on lines +83 to +84
// it also fails if we join to another subjoin
let _ = users::table.full_join(comments::table).full_join(posts::table.full_join(comments::table));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is no error for this case

Comment on lines +22 to +23
.full_join(posts::table.on(users::id.eq(posts::id)))
.get_result::<(Option<i32>, Option<i32>)>(&mut connection);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a valid selection here so that the error does not contain the unrelated mismatch of what the query returns and what the return type expects?

self.left
.source
.default_selection()
.nullable()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what could work is using the associated type specialization trick used in other places.

That means we would need to restructure that trait like follows:

// that shouldn't be part of the public exposed API, i.e. it shouldn't be nameable by 
// users outside of diesel
// Also: Prefer going with adding the `Expression` prefix everywhere 
// as it will otherwise be confused with the sql type trait
// (That already happens from time to time for the types, no 
// need to introduce another case here.)
pub trait IntoNullableExpression<E> {
    type NullableExpression;
    fn into_nullable_expression(e: E) -> Self:::NullableExpression;
}

impl<ST, E> IntoNullableExpression<E> for ST 
where 
    E: Expression<SqlType = ST>,
    ST: SqlType,
    // forward it to the other two impls
    (ST::IsNull, ST): IntoNullableExpression<E>,
{
    type NullableExpression = <(ST::IsNull, ST) as IntoNullableExpression<E>>::NullableExpression;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        <(ST::IsNull, ST) as IntoNullableExpression<E>>::into_nullable_expression(e)
    }
}

// these impls cannot overlap with each other as the Self type is clearly
// different. They also shouldn't overlap with the generic one
// above as `SqlType` is only implemented for tuples if all elements
// implement `SqlType` on their own, which is not the case for `is_nullable::NotNull` 
// and `is_nullable::IsNullable`
impl<ST, E> IntoNullableExpression<E> for (is_nullable::NotNull, ST)
where
    E: Expression<SqlType = ST>,
    ST: SqlType<IsNull = is_nullable::NotNull>,
{
    type NullableExpression = Nullable<E>;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        Nullable(self)
    }
}

impl<ST, E> IntoNullableExpression<E> for (is_nullable::IsNullable, ST)
where
    E: Expression<SqlType = ST>,
    ST: SqlType<IsNull = is_nullable::IsNullable>,
{
    type NullableExpression = E;

    fn into_nullable_expression(e: E) -> Self::NullableExpression {
        e
    }
}

in that way you have to two impl that are clearly not overlapping as they are for different types and a third impl on top of it that forwards to the other impls. You then need to call it slightly different as <E::SqlType as IntoNullableExpression<E>>::into_nullable_expression(e) from the calling side, but given that this is only one location + that's not public API I would consider that acceptable.

oeed and others added 3 commits March 14, 2025 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants