Pivot Transform¶
The pivot transform is, in short, a way to convert long-form data to wide-form data directly without any preprocessing (see Long-form vs. Wide-form Data for more information). Pivot transforms are useful for creating matrix or cross-tabulation data, acting as an inverse to the Fold Transform.
Here is an example, using Olympic medals data:
import altair as alt
import pandas as pd
df = pd.DataFrame.from_records([
{"country": "Norway", "type": "gold", "count": 14},
{"country": "Norway", "type": "silver", "count": 14},
{"country": "Norway", "type": "bronze", "count": 11},
{"country": "Germany", "type": "gold", "count": 14},
{"country": "Germany", "type": "silver", "count": 10},
{"country": "Germany", "type": "bronze", "count": 7},
{"country": "Canada", "type": "gold", "count": 11},
{"country": "Canada", "type": "silver", "count": 8},
{"country": "Canada", "type": "bronze", "count": 10}
])
alt.Chart(df).transform_pivot(
'type',
groupby=['country'],
value='count'
).mark_bar().encode(
x='gold:Q',
y='country:N',
)
The pivot transform, when combined with other elements of the Altair grammar, enables some very interesting chart types. For example, here we use pivot to create a single tooltip for values on multiple lines:
import altair as alt
from vega_datasets import data
source = data.stocks()
base = alt.Chart(source).encode(x='date:T')
columns = sorted(source.symbol.unique())
selection = alt.selection_single(
fields=['date'], nearest=True, on='mouseover', empty='none', clear='mouseout'
)
lines = base.mark_line().encode(y='price:Q', color='symbol:N')
points = lines.mark_point().transform_filter(selection)
rule = base.transform_pivot(
'symbol', value='price', groupby=['date']
).mark_rule().encode(
opacity=alt.condition(selection, alt.value(0.3), alt.value(0)),
tooltip=[alt.Tooltip(c, type='quantitative') for c in columns]
).add_selection(selection)
lines + points + rule
Transform Options¶
The transform_pivot()
method is built on the PivotTransform
class, which has the following options:
Property |
Type |
Description |
---|---|---|
groupby |
array( |
The optional data fields to group by. If not specified, a single group containing all data objects will be used. |
limit |
|
An optional parameter indicating the maximum number of pivoted fields to generate.
The default ( |
op |
|
The aggregation operation to apply to grouped |
pivot |
The data field to pivot on. The unique values of this field become new field names in the output stream. |
|
value |
The data field to populate pivoted fields. The aggregate values of this field become the values of the new pivoted fields. |