In Python Polars, it is easy to sum each inner list of a nested list column with polars.Expr.list.sum. For example:

import polars as pl

df = pl.DataFrame({"values": [[[1]], [[2, 3], [5, 6]]]})

df.with_columns(
    sum=pl.concat_list(pl.col("values")).list.eval(
        pl.element().list.sum()))

shape: (2, 2)
┌──────────────────┬───────────┐
│ values           ┆ sum       │
│ ---              ┆ ---       │
│ list[list[i64]]  ┆ list[i64] │
╞══════════════════╪═══════════╡
│ [[1]]            ┆ [1]       │
│ [[2, 3], [5, 6]] ┆ [5, 11]   │
└──────────────────┴───────────┘

I am trying to define the same logic for the product and the division, since neither is available in the current version of Polars (1.19). To do this, I am using pl.reduce, but it does not work as expected; the column comes back unchanged:

df.with_columns(
    sum=pl.concat_list(pl.col("values")).list.eval(
        pl.reduce(lambda e1, e2: e1 * e2, pl.element())))

shape: (2, 2)
┌──────────────────┬──────────────────┐
│ values           ┆ sum              │
│ ---              ┆ ---              │
│ list[list[i64]]  ┆ list[list[i64]]  │
╞══════════════════╪══════════════════╡
│ [[1]]            ┆ [[1]]            │
│ [[2, 3], [5, 6]] ┆ [[2, 3], [5, 6]] │
└──────────────────┴──────────────────┘


Do you have any suggestions on how to implement the above within a single expression context?

Asked Jan 8 at 18:51 by yz_jc
  • btw, your sum approach can be simplified to df.with_columns(sum=pl.col.values.list.eval(pl.element().list.sum())) – roman (commented 2 days ago)

1 Answer

You can nest two list.eval() calls, one for the outer lists and one for the inner lists:
  • pl.Expr.list.eval() to evaluate an expression within each list (list context).
  • pl.element() to refer to the elements inside that list context.
  • pl.Expr.product() to calculate the product of the elements.
  • pl.Expr.list.first() to extract the single-element result as a scalar.

df.with_columns(
    product=pl.col.values.list.eval(
        pl.element().list.eval(
            pl.element().product()
        ).list.first()
    )
)

shape: (2, 2)
┌──────────────────┬───────────┐
│ values           ┆ product   │
│ ---              ┆ ---       │
│ list[list[i64]]  ┆ list[i64] │
╞══════════════════╪═══════════╡
│ [[1]]            ┆ [1]       │
│ [[2, 3], [5, 6]] ┆ [6, 30]   │
└──────────────────┴───────────┘
