Stack overflow with LEAD and LAG functions #12731

Closed
@iilyak

Describe the bug

The following query causes a stack overflow in a Rust program that uses DataFusion:

SELECT seq
  FROM (
    SELECT seq,
      LEAD(seq) OVER (ORDER BY seq) AS next_seq
      FROM parquet_table
  ) AS subquery
  WHERE next_seq - seq > 1

thread 'main' has overflowed its stack
error: process didn't exit successfully: `target\debug\seq.exe` (exit code: 0xc00000fd, STATUS_STACK_OVERFLOW)

To Reproduce

use datafusion::prelude::{ParquetReadOptions, SessionContext};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
  // Register the Parquet file as the table `parquet_table`.
  let ctx = SessionContext::new();
  ctx
    .register_parquet(
      "parquet_table",
      "batches.parquet",
      ParquetReadOptions::default(),
    )
    .await?;

  let sql_query = "
  SELECT seq
  FROM (
    SELECT seq,
      LEAD(seq) OVER (ORDER BY seq) AS next_seq
      FROM parquet_table
  ) AS subquery
  WHERE next_seq - seq > 1
  ";

  let df = ctx.sql(sql_query).await?;

  df.show().await?;
  Ok(())
}
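
One way to check whether the crash is a stack-size limit on the main thread rather than unbounded recursion might be to run the same work on a thread with a larger stack. The sketch below uses only the standard library; `run_with_stack` and `depth` are hypothetical names introduced for illustration, and the 64 MiB figure is an arbitrary assumption, not a value taken from this report.

```rust
use std::thread;

/// Runs `f` on a newly spawned thread with `stack_bytes` of stack
/// and returns its result. Hypothetical helper, not a DataFusion API.
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

/// Stand-in for deeply recursive work; returns the recursion depth reached.
fn depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + depth(n - 1)
    }
}

fn main() {
    // 64 MiB is an assumed size chosen only for this sketch.
    let reached = run_with_stack(64 * 1024 * 1024, || depth(10_000));
    println!("{reached}");
}
```

To apply this to the reproduction, the closure would have to create its own Tokio runtime and block on the query instead of relying on `#[tokio::main]`. Whether a larger stack avoids the overflow for this query is untested here.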

Expected behavior

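The query should run to completion and detect gaps in the seq column, returning each seq value whose successor (in seq order) is more than 1 greater.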
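
For reference, the result the query is meant to produce can be sketched in plain Rust without DataFusion. `find_gaps` is a hypothetical helper, and it assumes `seq` is a non-null 64-bit integer column, which the report does not state. The last row has no successor, so `LEAD` yields NULL there and the row is filtered out, which the pairwise window below reproduces.

```rust
/// Emulates the query: sort by seq, pair each value with the next one
/// (LEAD), and keep the values whose successor is more than 1 greater.
fn find_gaps(seqs: &[i64]) -> Vec<i64> {
    let mut sorted = seqs.to_vec();
    sorted.sort_unstable(); // ORDER BY seq
    sorted
        .windows(2) // (seq, next_seq) pairs; the last row has no pair
        .filter(|w| w[1] - w[0] > 1) // WHERE next_seq - seq > 1
        .map(|w| w[0]) // SELECT seq
        .collect()
}

fn main() {
    let gaps = find_gaps(&[1, 2, 3, 5, 6, 9]);
    println!("{:?}", gaps); // prints [3, 6]
}
```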

Additional context

datafusion = { version = "41.0.0", features = ["parquet", "default"] }
arrow = "53.0.0"

Labels: bug