Add support for table valued functions for SQL Server #1839

aharpervc · 2025-05-06T19:22:57Z

This PR adds support for table valued functions for SQL Server, both inline and multi-statement. For reference, those are examples B and C in the documentation here: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-ver16#b-create-an-inline-table-valued-function

Inline TVFs are defined with AS RETURN, so we have a new CreateFunctionBody::AsReturn variant accordingly. Functions using "AS RETURN" don't have BEGIN/END, so that part of the parsing logic is now conditional. Additionally, the data type parser now supports "RETURNS TABLE" without a table definition.

Multi-statement TVFs use named table expressions, so a new NamedTable data type variant was added. I didn't see a great way to integrate this into the existing data type parser (especially without rewinding), so this data type is created inside the parse create function logic: first parse the identifier, then parse the table definition, then use those elements to produce a NamedTable.

I also added a new test example for each of these scenarios.

aharpervc · 2025-05-07T16:26:19Z

tests/sqlparser_mssql.rs

+        CREATE FUNCTION some_inline_tvf(@foo INT, @bar VARCHAR(256)) \
+        RETURNS TABLE \
+        AS \
+        RETURN (SELECT 1 AS col_1)\


Parentheses are optional for inline tvf return queries, although I think the subquery expr expects/requires them currently.

I added support for that syntax & added a new test case example

UNION is also valid in this syntax, but the current approach doesn't support it because it uses parse_select, which is restricted. I'm comfortable leaving that for later.

iffyio

took a quick look and left a couple comments, @aharpervc could you rebase on main now that the other PR has landed, in order to remove the extra diff?

iffyio · 2025-05-08T23:50:52Z

src/parser/mod.rs

+        if self.peek_keyword(Keyword::AS) {
+            self.expect_keyword_is(Keyword::AS)?;
+        }


Suggested change

-        if self.peek_keyword(Keyword::AS) {
-            self.expect_keyword_is(Keyword::AS)?;
-        }
+        self.parse_keyword(Keyword::AS);

this looks like it could be simplified as above?
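To illustrate why the two forms are equivalent, here is a minimal sketch using a toy parser. `ToyParser` and `skip_optional_as` are hypothetical stand-ins written for this example, not the crate's types; the only assumption carried over is that `parse_keyword` consumes the keyword when present and reports the outcome as a `bool`.

```rust
/// Hypothetical, simplified stand-in for the parser: tokens are plain strings.
struct ToyParser {
    tokens: Vec<String>,
    index: usize,
}

impl ToyParser {
    fn new(sql: &str) -> Self {
        ToyParser {
            tokens: sql.split_whitespace().map(str::to_string).collect(),
            index: 0,
        }
    }

    /// Returns true if the next token matches `kw`, without consuming it.
    fn peek_keyword(&self, kw: &str) -> bool {
        self.tokens
            .get(self.index)
            .is_some_and(|t| t.eq_ignore_ascii_case(kw))
    }

    /// Consumes the next token and returns true if it matches `kw`;
    /// otherwise leaves the position unchanged and returns false.
    fn parse_keyword(&mut self, kw: &str) -> bool {
        if self.peek_keyword(kw) {
            self.index += 1;
            true
        } else {
            false
        }
    }
}

/// Skips an optional AS and returns the index of the next unread token.
fn skip_optional_as(sql: &str) -> usize {
    let mut parser = ToyParser::new(sql);
    // The boolean result is deliberately ignored: AS is optional here.
    parser.parse_keyword("AS");
    parser.index
}
```

With this model, input starting with AS advances by one token and input without it is left untouched, which is the same effect as the peek-then-expect pair.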

iffyio · 2025-05-08T23:52:36Z

src/ast/data_type.rs

+                if fields.is_empty() {
+                    return write!(f, "TABLE");
+                }
+                write!(f, "TABLE({})", display_comma_separated(fields))


I think instead of using is_empty we can make the fields value an Option and skip the parentheses only if fields is None. Otherwise we could run into issues if an empty fields list is allowed in the other variant of this data type.
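A rough sketch of the difference, assuming a trimmed-down model where column definitions are plain strings (`TableType` is a hypothetical name for this example, not the crate's `DataType`): with an `Option`, "no column list" and "empty column list" render differently instead of collapsing into the same output.

```rust
use std::fmt;

/// Hypothetical, trimmed-down model of the data type discussed above.
/// Column definitions are plain strings here to keep the sketch short.
enum TableType {
    Table(Option<Vec<String>>),
}

impl fmt::Display for TableType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // No column list was written at all: `RETURNS TABLE`.
            TableType::Table(None) => write!(f, "TABLE"),
            // A column list was written, even if it is empty.
            TableType::Table(Some(fields)) => write!(f, "TABLE({})", fields.join(", ")),
        }
    }
}
```

In this model `Table(None)` prints `TABLE` while `Table(Some(vec![]))` prints `TABLE()`, so the two cases stay distinguishable when the AST is printed back out.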

aharpervc · 2025-05-09T15:12:45Z

> took a quick look and left a couple comments, @aharpervc could you rebase on main now that the other PR has landed, in order to remove the extra diff?

Done 👍

iffyio · 2025-05-10T00:50:23Z

src/ast/data_type.rs

+    NamedTable(
+        /// Table name.
+        ObjectName,
+        /// Table columns.
+        Vec<ColumnDef>,
+    ),


Suggested change

-    NamedTable(
-        /// Table name.
-        ObjectName,
-        /// Table columns.
-        Vec<ColumnDef>,
-    ),
+    NamedTable {
+        /// Table name.
+        table: ObjectName,
+        /// Table columns.
+        columns: Vec<ColumnDef>,
+    },

we can use an anonymous struct in this manner?

Oops, yes we can. Not sure why I did it that way. Done 👍
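For illustration, a small sketch of what the struct-style variant buys at use sites. `ReturnType` and `describe` are hypothetical names, the fields are plain strings rather than `ObjectName` and `ColumnDef`, and the rendered format is an assumption made for this example only.

```rust
/// Hypothetical model: names and columns are plain strings, not the
/// crate's `ObjectName` and `ColumnDef` types.
enum ReturnType {
    NamedTable {
        /// Table variable name, e.g. `@result`.
        table: String,
        /// Column definitions of the returned table.
        columns: Vec<String>,
    },
}

/// Renders the return type; fields are bound by name, so their order in the
/// pattern does not matter and a reader can see which value is which.
fn describe(ty: &ReturnType) -> String {
    match ty {
        ReturnType::NamedTable { columns, table } => {
            format!("{} TABLE ({})", table, columns.join(", "))
        }
    }
}
```

Compared with a tuple variant, construction and matching both name the fields explicitly, so swapping two positional arguments by accident is not possible.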

iffyio · 2025-05-10T00:50:52Z

src/ast/data_type.rs

-    Table(Vec<ColumnDef>),
+    /// [MsSQL]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-ver16#c-create-a-multi-statement-table-valued-function
+    Table(Option<Vec<ColumnDef>>),
+    /// Table type with a name, e.g. CREATE FUNCTION RETURNS @result TABLE(...).


Could we add a link to the docs that support NamedTable variant?

iffyio · 2025-05-10T00:51:36Z

src/ast/mod.rs

+    /// RETURNS TABLE
+    /// AS RETURN SELECT a + b AS sum;
+    /// ```
+    AsReturnSelect(Select),


maybe we can also include a reference doc link here?

iffyio · 2025-05-10T00:52:58Z

src/parser/mod.rs

-        }));
+            Ok(DataType::NamedTable(
+                ObjectName(vec![ObjectNamePart::Identifier(return_table_name)]),
+                table_column_defs.clone().unwrap(),


can we return an error instead of the unwrap here?

I reworked this code to address your several comments 👍

iffyio · 2025-05-10T00:54:02Z

src/parser/mod.rs

-        self.expect_keyword_is(Keyword::AS)?;
+        let return_table = self.maybe_parse(|p| {
+            let return_table_name = p.parse_identifier()?;
+            let table_column_defs = if p.peek_keyword(Keyword::TABLE) {


I think we can replace the if/else statement here with self.expect_keyword(Keyword::TABLE)

iffyio · 2025-05-10T00:54:42Z

src/parser/mod.rs

-        let statements = self.parse_statement_list(&[Keyword::END])?;
-        let end_token = self.expect_keyword(Keyword::END)?;
+            if table_column_defs.is_none()
+                || table_column_defs.clone().is_some_and(|tcd| tcd.is_empty())


hmm is the clone() here necessary?
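One way the same check could be written without cloning, as a sketch: `lacks_columns` is a hypothetical helper and the column type is simplified to `String`. `Option::as_ref` and `Option::is_some_and` are standard library methods.

```rust
/// Returns true when the column list is missing or empty.
/// `as_ref()` turns `&Option<Vec<_>>` into `Option<&Vec<_>>`, so
/// `is_some_and` can consume that cheap reference instead of a cloned vector.
fn lacks_columns(table_column_defs: &Option<Vec<String>>) -> bool {
    table_column_defs.is_none()
        || table_column_defs.as_ref().is_some_and(|tcd| tcd.is_empty())
}
```

The clone in the original snippet exists only because `is_some_and` takes the `Option` by value; borrowing first avoids copying the whole vector for a length check.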

iffyio · 2025-05-10T00:57:26Z

src/parser/mod.rs

+            let return_table_name = p.parse_identifier()?;
+            let table_column_defs = if p.peek_keyword(Keyword::TABLE) {
+                match p.parse_data_type()? {
+                    DataType::Table(t) => t,


I think we can already check here that the returned data type is non-empty; that would avoid the invalid Option case propagating further below (where we have the is_none and is_some checks). e.g.

DataType::Table(Some(t)) if !t.is_empty() => t
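A self-contained sketch of that match guard, under simplifying assumptions: this `DataType` is a hypothetical two-variant model with string columns, and `require_table_columns` is an illustrative helper that returns a plain `String` error rather than the crate's parser error.

```rust
/// Hypothetical model of the relevant data types.
enum DataType {
    Int,
    Table(Option<Vec<String>>),
}

/// Accepts only a TABLE type with at least one column definition.
fn require_table_columns(data_type: DataType) -> Result<Vec<String>, String> {
    match data_type {
        // The guard rejects `Some(vec![])`, so it falls through to the error arm.
        DataType::Table(Some(columns)) if !columns.is_empty() => Ok(columns),
        // Covers `Table(None)`, `Table(Some(vec![]))`, and any other data type.
        _ => Err("Expected table column definitions after TABLE keyword".to_string()),
    }
}
```

Because the single guarded arm is the only success path, the missing, empty, and wrong-type cases all end up in one error arm and no `Option` needs to be carried further.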

iffyio · 2025-05-10T00:57:49Z

src/parser/mod.rs

-        }));
+            Ok(DataType::NamedTable(
+                ObjectName(vec![ObjectNamePart::Identifier(return_table_name)]),
+                table_column_defs.clone().unwrap(),


similarly here, is the clone necessary?

iffyio · 2025-05-11T00:08:38Z

src/parser/mod.rs

+                if !matches!(expr, Expr::Subquery(_)) {
+                    parser_err!(
+                        "Expected a subquery after RETURN",
+                        self.peek_token().span.start
+                    )?
+                }
+                Some(CreateFunctionBody::AsReturnSubquery(expr))


Suggested change

-                if !matches!(expr, Expr::Subquery(_)) {
-                    parser_err!(
-                        "Expected a subquery after RETURN",
-                        self.peek_token().span.start
-                    )?
-                }
-                Some(CreateFunctionBody::AsReturnSubquery(expr))
+                Some(CreateFunctionBody::AsReturnExpr(expr))

thinking since a subquery is already an expr, we can be more permissive here?

I think this is a "could" vs "should" situation. The question here is what is the parser expecting? I don't think it makes sense to parse arbitrary expr's if they wouldn't otherwise be allowed by any sql engine... that seems better as a parse failure.

Yeah, ideally we would parse statements according to spec, but the parser is more permissive in some cases, especially in order to minimize complexity in the parser. In this case we would avoid introducing an extra AsReturnSubquery enum variant, as well as the matches!() check that validates after the fact of parsing an expression, which isn't ideal. So I think we'd be better off leaving the validation to the downstream crate in this scenario.

Hm, yes I suppose that makes sense. It seems odd for the parser to produce output if it's known to be invalid, but I suppose you're saying, the validity question is in the semantics rather than the syntax.

Loosening the implementation to any Expr would probably mean that select .. union .. select syntax would begin parsing, which addresses my concern above.

I'll investigate...

The primary reason to have both AsReturnSubquery and AsReturnSelect is to distinguish between AS RETURN (SELECT a + b AS sum) and AS RETURN SELECT a + b AS sum.

In other PR's, such as recent work on BeginEndStatements, I believe you had recommended that with/without wrapping tokens should be distinct enum variants instead of something like present/empty AttachedTokens. Based on that guidance, it seems like it'd also fit to have two distinct enum variants here as well.

There is your additional concern regarding "subquery" vs "expr". To address this I have renamed AsReturnSubquery to AsReturnExpr & removed the non-subquery parser error.

So I think this concern is now resolved
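To make the two-variant argument concrete, here is a rough model of how separate variants let the printed SQL match the input form. `FunctionBody` and `classify` are hypothetical; the query is kept as text instead of the crate's `Expr` and `Select` nodes, and the parenthesis check is deliberately naive.

```rust
use std::fmt;

/// Hypothetical model: the query text stands in for the crate's `Expr`
/// and `Select` AST nodes.
enum FunctionBody {
    /// `AS RETURN (SELECT ...)`: a parenthesized expression.
    AsReturnExpr(String),
    /// `AS RETURN SELECT ...`: a bare SELECT.
    AsReturnSelect(String),
}

impl fmt::Display for FunctionBody {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FunctionBody::AsReturnExpr(query) => write!(f, "AS RETURN ({query})"),
            FunctionBody::AsReturnSelect(query) => write!(f, "AS RETURN {query}"),
        }
    }
}

/// Classifies the text following `AS RETURN` by whether it is wrapped in
/// one outer pair of parentheses. This is a naive textual check, not a parser.
fn classify(body: &str) -> FunctionBody {
    let trimmed = body.trim();
    match trimmed.strip_prefix('(').and_then(|rest| rest.strip_suffix(')')) {
        Some(inner) => FunctionBody::AsReturnExpr(inner.trim().to_string()),
        None => FunctionBody::AsReturnSelect(trimmed.to_string()),
    }
}
```

With a single variant, the printer would have to pick one form for both inputs, so one of them would not round-trip to the same text.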

iffyio · 2025-05-14T07:28:12Z

src/parser/mod.rs

+                    if self.peek_token() != Token::LParen {
+                        Ok(DataType::Table(None))


can we add a comment here explaining what the LParen check is for?

Done, and I simplified the if/else 👍

iffyio · 2025-05-14T07:32:38Z

src/parser/mod.rs

+                if !matches!(expr, Expr::Subquery(_)) {
+                    parser_err!(
+                        "Expected a subquery after RETURN",
+                        self.peek_token().span.start
+                    )?
+                }
+                Some(CreateFunctionBody::AsReturnSubquery(expr))


Yeah, ideally we would parse statements according to spec, but the parser is more permissive in some cases, especially in order to minimize complexity in the parser. In this case we would avoid introducing an extra AsReturnSubquery enum variant, as well as the matches!() check that validates after the fact of parsing an expression, which isn't ideal. So I think we'd be better off leaving the validation to the downstream crate in this scenario.

aharpervc · 2025-05-15T18:02:56Z

@iffyio concerns addressed & rebased on latest master. I'm unsure how to address the linting CI job errors and I can't reproduce the report locally, so I may need some guidance on that.

aharpervc · 2025-05-19T20:12:49Z

It seems the CI failure for the lint job was addressed by #1856. I've rebased again, all good 👍

iffyio · 2025-05-20T04:26:27Z

src/parser/mod.rs

+            if !p.peek_keyword(Keyword::TABLE) {
+                parser_err!(
+                    "Expected TABLE keyword after return type",
+                    p.peek_token().span.start
+                )?
+            }


Suggested change

-            if !p.peek_keyword(Keyword::TABLE) {
-                parser_err!(
-                    "Expected TABLE keyword after return type",
-                    p.peek_token().span.start
-                )?
-            }
+            p.expect_keyword(Keyword::TABLE)?;

The reason I didn't do it this way is because expect_keyword (and expect_keyword_is) consume the token. That causes parse_data_type to break, because it uses the TABLE keyword to understand it should parse the table data type.

However, I suppose that I could just call prev_token() after expect to undo consuming the token. I will make this change.
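A sketch of the expect-then-rewind idea on a toy parser. Everything here is hypothetical (`ToyParser`, `parse_named_table`, string tokens, string errors); it only models the behavior described above, where `expect_keyword` consumes the token and `prev_token` steps back so the data type parser can see TABLE again.

```rust
/// Hypothetical, simplified stand-in for the parser: tokens are plain strings.
struct ToyParser {
    tokens: Vec<String>,
    index: usize,
}

impl ToyParser {
    fn new(sql: &str) -> Self {
        ToyParser {
            tokens: sql.split_whitespace().map(str::to_string).collect(),
            index: 0,
        }
    }

    /// Consumes the next token if it matches `kw`, otherwise returns an error.
    fn expect_keyword(&mut self, kw: &str) -> Result<(), String> {
        match self.tokens.get(self.index) {
            Some(t) if t.eq_ignore_ascii_case(kw) => {
                self.index += 1;
                Ok(())
            }
            other => Err(format!("Expected {kw}, found {other:?}")),
        }
    }

    /// Steps back one token so the next read sees it again.
    fn prev_token(&mut self) {
        self.index = self.index.saturating_sub(1);
    }

    /// Stand-in for data type parsing: it needs to see TABLE itself.
    fn parse_data_type(&mut self) -> Result<String, String> {
        self.expect_keyword("TABLE")?;
        let rest = self.tokens[self.index..].join(" ");
        self.index = self.tokens.len();
        Ok(format!("TABLE {rest}"))
    }
}

/// Validates that TABLE follows the name, then rewinds so that
/// `parse_data_type` can consume TABLE on its own.
fn parse_named_table(sql: &str) -> Result<(String, String), String> {
    let mut parser = ToyParser::new(sql);
    let name = parser.tokens.first().cloned().ok_or("Expected a name")?;
    parser.index = 1;
    parser.expect_keyword("TABLE")?;
    parser.prev_token();
    let data_type = parser.parse_data_type()?;
    Ok((name, data_type))
}
```

The early `expect_keyword` gives a specific error when TABLE is missing, and the rewind keeps the data type parser unchanged.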

iffyio · 2025-05-20T04:31:05Z

src/parser/mod.rs

+                DataType::Table(maybe_table_column_defs) => match maybe_table_column_defs {
+                    Some(table_column_defs) => {
+                        if table_column_defs.is_empty() {
+                            parser_err!(
+                                "Expected table column definitions after TABLE keyword",
+                                p.peek_token().span.start
+                            )?
+                        }
+
+                        table_column_defs
+                    }
+                    None => parser_err!(
+                        "Expected table column definitions after TABLE keyword",
+                        p.peek_token().span.start
+                    )?,
+                },
+                _ => parser_err!(
+                    "Expected table data type after TABLE keyword",
+                    p.peek_token().span.start
+                )?,
+            };


Suggested change

-                DataType::Table(maybe_table_column_defs) => match maybe_table_column_defs {
-                    Some(table_column_defs) => {
-                        if table_column_defs.is_empty() {
-                            parser_err!(
-                                "Expected table column definitions after TABLE keyword",
-                                p.peek_token().span.start
-                            )?
-                        }
-
-                        table_column_defs
-                    }
-                    None => parser_err!(
-                        "Expected table column definitions after TABLE keyword",
-                        p.peek_token().span.start
-                    )?,
-                },
-                _ => parser_err!(
-                    "Expected table data type after TABLE keyword",
-                    p.peek_token().span.start
-                )?,
-            };
+                DataType::Table(Some(table_column_defs)) if !table_column_defs.is_empty() => table_column_defs,
+                _ => parser_err!(
+                    "Expected table data type after TABLE keyword",
+                    p.peek_token().span.start
+                )?,
+            };

looks like this condition can be simplified?
Also, could we add a negative test to assert the behavior on invalid data types? It seems to be lacking coverage in the PR tests.

I consolidated this per your suggestion, and I added a test to assert a parser error for an incorrect table definition 👍

iffyio · 2025-05-20T04:36:22Z

src/parser/mod.rs

+                parser_err!(
+                    "Expected a subquery (or bare SELECT statement) after RETURN",
+                    self.peek_token().span.start
+                )?
+            }


can we add a test for this behavior? (e.g. one that passes an ordinary, non-subquery expression and then verifies that the parser rejects it)

aharpervc marked this pull request as ready for review May 6, 2025 19:24

aharpervc changed the title ~~Add support for table valued functions for SQL Serve~~ Add support for table valued functions for SQL Server May 6, 2025

aharpervc force-pushed the mssql-create-tvf branch 3 times, most recently from 3686b1b to db1c9b2 Compare May 6, 2025 23:37

iffyio reviewed May 8, 2025

aharpervc force-pushed the mssql-create-tvf branch from 1b92ede to 999964c Compare May 9, 2025 15:10

aharpervc requested a review from iffyio May 9, 2025 15:12

iffyio reviewed May 10, 2025

iffyio reviewed May 11, 2025

iffyio reviewed May 14, 2025

aharpervc force-pushed the mssql-create-tvf branch from 88c5507 to d509d4e Compare May 15, 2025 17:53

aharpervc requested a review from iffyio May 15, 2025 18:02

aharpervc added 14 commits May 19, 2025 16:10

Add support for inline table valued functions for SQL Server

e2c8b8b

Add multi-statement table valued function support for SQL Server

0288177

Enable parsing CREATE FUNCTION without AS

25a4416

Add support for constraints in table valued function definitions

b84ec50

Corrected syntax formatting

c2fd171

Add support for un-parenthesized "RETURN SELECT" syntax

9e45f27

- rename `AsReturn` to `AsReturnSubquery` for clarity between these two variants

Simplify peek/expect with parse_keyword

10919b5

Make Table's ColumnDefs optional

6798bcc

Refactor to avoid cloning/unwrapping

846e5c9

Refactor NamedTable to be a regular struct

e36933c

Add documentation link for named tables types

0e11006

Add documentation link for RETURN SELECT

4dd05d1

Add comment to explain LParen behavior

486f85a

- plus, flip the if/else to positive equality for simplicity

Rename AsReturnSubquery to AsReturnExpr & remove non-subquery error

8bc8d38

aharpervc force-pushed the mssql-create-tvf branch from d509d4e to 8bc8d38 Compare May 19, 2025 20:10

iffyio reviewed May 20, 2025

aharpervc added 2 commits May 20, 2025 10:05

Switch from peek to expect + prev

9b6d6b1

Simplify TABLE type parser errors & add test examples for parse failures

0add6e2
