repartition

Returns a new DataFrame partitioned by the given partitioning expressions. The resulting DataFrame is hash partitioned.
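To make "hash partitioned" concrete, here is a minimal pure-Python sketch of the idea, with no Spark involved. The helper `hash_partition` and the use of `zlib.crc32` are assumptions made for illustration only; Spark uses its own internal hash function, so the partition numbers it produces will differ. The property being shown is that rows with the same key always land in the same partition.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Group rows into partitions by hashing the value of `key` (illustrative only)."""
    partitions = defaultdict(list)
    for row in rows:
        # zlib.crc32 is a stand-in stable hash, not the hash Spark uses.
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        # The partition index is the hash modulo the target partition count.
        partitions[h % num_partitions].append(row)
    return dict(partitions)


# Hypothetical sample data: 20 rows with 5 distinct "age" values.
rows = [{"id": i, "age": i % 5} for i in range(20)]
parts = hash_partition(rows, "age", 3)

for pid in sorted(parts):
    ages = sorted({r["age"] for r in parts[pid]})
    print(f"partition {pid}: {len(parts[pid])} rows, ages {ages}")
```

Because the partition index depends only on the key's hash, some partitions may hold more rows than others, or stay empty, when there are few distinct keys.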

Syntax

repartition(numPartitions: Union[int, "ColumnOrName"], *cols: "ColumnOrName")

Parameters

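Parameter	Type	Description
`numPartitions`	int, str, or Column	Either an int giving the target number of partitions, or a column. If a column is given, it is used as the first partitioning column. If not specified, the default number of partitions is used.
`cols`	str or Column	Partitioning columns.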

Returns

DataFrame: The repartitioned DataFrame.

Examples

from pyspark.sql import functions as sf
df = spark.range(0, 64, 1, 9).withColumn(
    "name", sf.concat(sf.lit("name_"), sf.col("id").cast("string"))
).withColumn(
    "age", sf.col("id") - 32
)
# Repartition the data into 10 partitions.
df.repartition(10).select(
    sf.spark_partition_id().alias("partition")
).distinct().sort("partition").show()
# +---------+
# |partition|
# +---------+
# |        0|
# ...
# |        9|
# +---------+

# Repartition the data into 7 partitions by the "age" column.
df.repartition(7, "age").select(
    sf.spark_partition_id().alias("partition")
).distinct().sort("partition").show()
# +---------+
# |partition|
# +---------+
# |        0|
# ...
# |        6|
# +---------+

Last updated on 2026-04-19