Encodings

The key to creating meaningful visualizations is to map properties of the data to visual properties in order to effectively communicate information. In Altair, this mapping of visual properties to data columns is referred to as an encoding, and is most often expressed through the Chart.encode() method.

For example, here we will visualize the cars dataset using four of the available encodings: x (the x-axis value), y (the y-axis value), color (the color of the marker), and shape (the shape of the point marker):

import altair as alt
from vega_datasets import data
cars = data.cars()

alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
    shape='Origin'
)

For data specified as a DataFrame, Altair can automatically determine the correct data type for each encoding, and creates appropriate scales and legends to represent the data.

Encoding Channels

Altair provides a number of encoding channels that can be useful in different circumstances; the following tables summarize them:

Position Channels:

Channel | Altair Class | Description | Example
x | X | The x-axis value | Simple Scatter Plot with Tooltips
y | Y | The y-axis value | Simple Scatter Plot with Tooltips
x2 | X2 | Second x value for ranges | Error Bars showing Confidence Interval
y2 | Y2 | Second y value for ranges | Line Chart with Confidence Interval Band
longitude | Longitude | Longitude for geo charts | Locations of US Airports
latitude | Latitude | Latitude for geo charts | Locations of US Airports
longitude2 | Longitude2 | Second longitude value for ranges | N/A
latitude2 | Latitude2 | Second latitude value for ranges | N/A
xError | XError | The x-axis error value | N/A
yError | YError | The y-axis error value | N/A
xError2 | XError2 | The second x-axis error value | N/A
yError2 | YError2 | The second y-axis error value | N/A

Mark Property Channels:

Channel | Altair Class | Description | Example
color | Color | The color of the mark | Simple Heatmap
fill | Fill | The fill for the mark | N/A
fillOpacity | FillOpacity | The opacity of the mark’s fill | N/A
opacity | Opacity | The opacity of the mark | Horizon Graph
shape | Shape | The shape of the mark | N/A
size | Size | The size of the mark | Table Bubble Plot (Github Punch Card)
stroke | Stroke | The stroke of the mark | N/A
strokeDash | StrokeDash | The stroke dash style | Multi Series Line Chart
strokeOpacity | StrokeOpacity | The opacity of the line | N/A
strokeWidth | StrokeWidth | The width of the line | N/A

Text and Tooltip Channels:

Channel | Altair Class | Description | Example
text | Text | Text to use for the mark | Simple Scatter Plot with Labels
key | Key | | N/A
tooltip | Tooltip | The tooltip value | Simple Scatter Plot with Tooltips

Hyperlink Channel:

Channel | Altair Class | Description | Example
href | Href | Hyperlink for points | N/A

Level of Detail Channel:

Channel | Altair Class | Description | Example
detail | Detail | Additional property to group by | Ranged Dot Plot

Order Channel:

Channel | Altair Class | Description | Example
order | Order | Sets the order of the marks | Connected Scatterplot (Lines with Custom Paths)

Facet Channels:

Channel | Altair Class | Description | Example
column | Column | The column of a faceted plot | Trellis Scatter Plot
row | Row | The row of a faceted plot | Becker’s Barley Trellis Plot
facet | Facet | The row and/or column of a general faceted plot | US Population: Wrapped Facet

Encoding Data Types

The details of any mapping depend on the type of the data. Altair recognizes five main data types:

Data Type | Shorthand Code | Description
quantitative | Q | a continuous real-valued quantity
ordinal | O | a discrete ordered quantity
nominal | N | a discrete unordered category
temporal | T | a time or date value
geojson | G | a geographic shape

If types are not specified for data input as a DataFrame, Altair defaults to quantitative for any numeric data, temporal for date/time data, and nominal for string data, but be aware that these defaults are by no means always the correct choice!
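The default rule above can be sketched in plain Python. This is only an illustration: `default_type` is a hypothetical helper that inspects Python values, whereas Altair's own inference works on DataFrame columns and is not reproduced here. The fallback to nominal for anything unrecognized is an assumption.

```python
from datetime import date, datetime
from numbers import Number


def default_type(values):
    """Guess an encoding type from a column of plain Python values.

    Hypothetical helper for illustration only.
    """
    non_null = [v for v in values if v is not None]
    if non_null and all(isinstance(v, (date, datetime)) for v in non_null):
        return "temporal"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if non_null and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in non_null
    ):
        return "quantitative"
    # Assumption: strings and anything unrecognized are treated as nominal.
    return "nominal"


years = [1970, 1971, 1972]
print(default_type(years))             # integer years are guessed as quantitative
print(default_type(["USA", "Japan"]))  # strings are guessed as nominal
```

Integer years are a typical case where the default guess may not be what you want; an ordinal or temporal type is often a better fit.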

The types can either be expressed in a long-form using the channel encoding classes such as X and Y, or in short-form using the Shorthand Syntax discussed below. For example, the following two methods of specifying the type will lead to identical plots:

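# Short form: the type is given as a shorthand code after the colon
alt.Chart(cars).mark_point().encode(
    x='Acceleration:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N'
)

# Long form: the type is given as a keyword argument of the channel class
alt.Chart(cars).mark_point().encode(
    alt.X('Acceleration', type='quantitative'),
    alt.Y('Miles_per_Gallon', type='quantitative'),
    alt.Color('Origin', type='nominal')
)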

The shorthand form, x="name:Q", is useful for its lack of boilerplate when doing quick data explorations. The long-form, alt.X('name', type='quantitative'), is useful when doing more fine-tuned adjustments to the encoding, such as binning, axis and scale properties, or more.
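To make the relationship between the two forms concrete, here is a minimal sketch of how a shorthand string could be expanded into the field, type, and aggregate properties of the long form. `expand_shorthand` is a hypothetical helper written for this illustration; it is a simplification and not Altair's own parser.

```python
import re

# Shorthand codes from the data type table above.
TYPE_CODES = {
    "Q": "quantitative",
    "O": "ordinal",
    "N": "nominal",
    "T": "temporal",
    "G": "geojson",
}

# Either "aggregate(field)" or a bare field name, optionally followed by ":CODE".
_SHORTHAND = re.compile(
    r"^(?:(?P<aggregate>\w+)\((?P<agg_field>[^)]*)\)|(?P<field>.+?))"
    r"(?::(?P<code>[QONTG]))?$"
)


def expand_shorthand(shorthand):
    """Expand e.g. 'mean(people):Q' into a dict of long-form properties."""
    match = _SHORTHAND.match(shorthand)
    if match is None:
        raise ValueError(f"cannot parse shorthand: {shorthand!r}")
    result = {}
    if match["aggregate"]:
        result["aggregate"] = match["aggregate"]
        if match["agg_field"]:  # 'count()' has no field
            result["field"] = match["agg_field"]
    else:
        result["field"] = match["field"]
    if match["code"]:
        result["type"] = TYPE_CODES[match["code"]]
    return result


print(expand_shorthand("Acceleration:Q"))
print(expand_shorthand("mean(people):Q"))
```

A string without a type code, such as 'Origin', expands to a field name only, which is the case where the type has to be inferred from the data.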

Specifying the correct type for your data is important, as it affects the way Altair represents your encoding in the resulting plot.

Effect of Data Type on Color Scales

As an example of this, here we will represent the same data three different ways, with the color encoded as a quantitative, ordinal, and nominal type, using three vertically-concatenated charts (see Vertical Concatenation):

base = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
).properties(
    width=150,
    height=150
)

alt.vconcat(
    base.encode(color='Cylinders:Q').properties(title='quantitative'),
    base.encode(color='Cylinders:O').properties(title='ordinal'),
    base.encode(color='Cylinders:N').properties(title='nominal'),
)

The type specification influences the way Altair, via Vega-Lite, decides on the color scale to represent the value, and influences whether a discrete or continuous legend is used.

Effect of Data Type on Axis Scales

Similarly, for x and y axis encodings, the type used for the data will affect the scales used and the characteristics of the mark. For example, here is the difference between a quantitative and ordinal scale for a column that contains integers specifying a year:

pop = data.population.url

base = alt.Chart(pop).mark_bar().encode(
    alt.Y('mean(people):Q', title='total population')
).properties(
    width=200,
    height=200
)

alt.hconcat(
    base.encode(x='year:Q').properties(title='year=quantitative'),
    base.encode(x='year:O').properties(title='year=ordinal')
)

In Altair, quantitative scales always start at zero unless otherwise specified, while ordinal scales are limited to the values within the data.
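The difference between the two scale domains can be sketched in a few lines of plain Python. The function names and the sample years are hypothetical, and details such as rounding the domain to "nice" tick values are deliberately left out.

```python
def quantitative_domain(values, zero=True):
    """Continuous domain; extended to include zero unless zero=False."""
    lo, hi = min(values), max(values)
    if zero:
        lo, hi = min(lo, 0), max(hi, 0)
    return (lo, hi)


def ordinal_domain(values):
    """Discrete domain: only the distinct values present in the data."""
    return sorted(set(values))


years = [1850, 1860, 1870, 1900]  # made-up sample with a gap

print(quantitative_domain(years))              # spans from 0 up to the largest year
print(quantitative_domain(years, zero=False))  # spans only the observed range
print(ordinal_domain(years))                   # one slot per distinct year
```

On the continuous domain the gap between 1870 and 1900 takes up space, while on the discrete domain the four years sit side by side.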

Overriding the behavior of including zero in the axis, we see that even then the precise appearance of the marks representing the data is affected by the data type:

base.encode(
    alt.X('year:Q',
        scale=alt.Scale(zero=False)
    )
)

Because quantitative values do not have an inherent width, the bars do not fill the entire space between the values. This view also makes clear the missing year of data that was not immediately apparent when we treated the years as categories.

This kind of behavior is sometimes surprising to new users, but it emphasizes the importance of thinking carefully about your data types when visualizing data: a visual encoding that is suitable for categorical data may not be suitable for quantitative data, and vice versa.

Encoding Channel Options

Each encoding channel allows for a number of additional options to be expressed; these can control things like axis properties, scale properties, headers and titles, binning parameters, aggregation, sorting, and many more.

The particular options that are available vary by encoding type; the various options are listed below.

The X and Y encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

axis

anyOf(Axis, null)

An object defining properties of axis’s gridlines, ticks and labels. If null, the axis for the encoding channel will be removed.

Default value: If undefined, default axis properties are applied.

See also: axis documentation.

band

number

For rect-based marks (rect, bar, and image), mark size relative to bandwidth of band scales or time units. If set to 1, the mark size is set to the bandwidth or the time unit interval. If set to 0.5, the mark size is half of the bandwidth or the time unit interval.

For other marks, relative position on a band of a stacked, binned, time unit or band scale. If set to 0, the marks will be positioned at the beginning of the band. If set to 0.5, the marks will be positioned in the middle of the band.
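The arithmetic behind the band property can be sketched as follows. Both helper names are hypothetical, and pixel values are used purely for illustration.

```python
def band_mark_size(bandwidth, band=1.0):
    """Size of a rect-based mark as a fraction of the bandwidth."""
    return band * bandwidth


def band_position(band_start, bandwidth, band=0.5):
    """Position of a non-rect mark within a band that starts at band_start."""
    return band_start + band * bandwidth


# A band that starts at pixel 100 and is 20 pixels wide.
print(band_mark_size(20, band=0.5))       # half of the bandwidth
print(band_position(100, 20, band=0))     # beginning of the band
print(band_position(100, 20, band=0.5))   # middle of the band
```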

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.
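As an illustration of the "binned" case, the sketch below pre-bins a list of numbers in plain Python. The helper `prebin` and the column names `bin_start`, `bin_end`, and `count` are hypothetical choices made for this example.

```python
import math
from collections import Counter


def prebin(values, step):
    """Group values into fixed-width bins and count the members of each bin."""
    counts = Counter(math.floor(v / step) * step for v in values)
    return [
        {"bin_start": start, "bin_end": start + step, "count": counts[start]}
        for start in sorted(counts)
    ]


rows = prebin([1, 2, 11], step=10)
for row in rows:
    print(row)
```

With data in this shape, the bin-start column would be mapped to x with bin set to "binned" and the bin-end column to x2, as described above.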

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.
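The nested-access and escaping rules in note 1 can be sketched in plain Python. `split_field_path` and `get_nested` are hypothetical helpers that only approximate the behavior described; they are not the parser used by Vega-Lite.

```python
def split_field_path(field):
    """Split a field string into path segments, honouring backslash escapes."""
    parts, current = [], []

    def flush():
        if current:
            parts.append("".join(current))
            current.clear()

    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            current.append(field[i + 1])  # escaped character is kept literally
            i += 2
        elif ch == ".":
            flush()
            i += 1
        elif ch == "[":
            flush()
            end = field.index("]", i)
            parts.append(field[i + 1:end].strip("'\""))
            i = end + 1
        else:
            current.append(ch)
            i += 1
    flush()
    return parts


def get_nested(datum, field):
    """Follow the path segments through nested dicts and lists."""
    value = datum
    for part in split_field_path(field):
        # Lists are indexed by position, mappings by key.
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


print(get_nested({"foo": {"bar": 1}}, "foo.bar"))  # nested access
print(get_nested({"a.b": 2}, "a\\.b"))             # escaped dot, flat key
```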

impute

anyOf(ImputeParams, null)

An object defining the properties of the Impute Operation to be applied. The field value of the other positional channel is taken as key of the Impute Operation. The field of the color channel if specified is used as groupby of the Impute Operation.

See also: impute documentation.

scale

anyOf(Scale, null)

An object defining properties of the channel’s scale, which is the function that transforms values in the data domain (numbers, dates, strings, etc) to visual values (pixels, colors, sizes) of the encoding channels.

If null, the scale will be disabled and the data value will be directly encoded.

Default value: If undefined, default scale properties are applied.

See also: scale documentation.

sort

Sort

Sort order for the encoded field.

For continuous fields (quantitative or temporal), sort can be either "ascending" or "descending".

For discrete fields, sort can be one of the following:

Default value: "ascending"

Note: null and sorting by another channel is not supported for row and column.

See also: sort documentation.

stack

anyOf(StackOffset, null, boolean)

Type of stacking offset if the field should be stacked. stack is only applicable for x and y channels with continuous domains. For example, stack of y can be used to customize stacking for a vertical bar chart.

stack can be one of the following values:

  • "zero" or true: stacking with baseline offset at zero value of the scale (for creating typical stacked bar and area charts).

  • "normalize": stacking with normalized domain (for creating normalized stacked bar and area charts).

  • "center": stacking with center baseline (for streamgraph).

  • null or false: no stacking. This will produce layered bar and area charts.

Default value: zero for plots where all of the following conditions are true: (1) the mark is bar or area; (2) the stacked measure channel (x or y) has a linear scale; (3) at least one of the non-position channels is mapped to an unaggregated field that is different from x and y. Otherwise, null by default.

See also: stack documentation.
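A rough sketch of what the different stack offsets do to the start and end of each stacked segment is given below. `stack_layout` is a hypothetical helper for non-negative values. The "center" branch centers each stack within the tallest stack, which is an assumption about the underlying behavior rather than something stated above.

```python
def stack_layout(stacks, offset="zero"):
    """Return (start, end) intervals for each value of each stack.

    stacks: mapping of stack key -> list of non-negative values.
    offset: "zero" (or True), "normalize", "center", or None/False.
    """
    tallest = max(sum(values) for values in stacks.values())
    layout = {}
    for key, values in stacks.items():
        total = sum(values)
        if offset in (None, False):
            # No stacking: every segment starts at the baseline (layered).
            layout[key] = [(0, v) for v in values]
            continue
        scale = 1 / total if offset == "normalize" and total else 1
        # Assumption: "center" centers each stack within the tallest one.
        start = (tallest - total) / 2 if offset == "center" else 0
        intervals = []
        for v in values:
            end = start + v * scale
            intervals.append((start, end))
            start = end
        layout[key] = intervals
    return layout


data = {"a": [1, 3], "b": [2]}
for mode in ("zero", "normalize", "center", None):
    print(mode, stack_layout(data, mode))
```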

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or for a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.
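The default title rule can be sketched as a small function. `default_title` and `TIME_UNIT_LABELS` are hypothetical names; only the time unit label that appears in the examples above is listed, and other units fall back to their raw name as an assumption.

```python
# Only the label documented above is included; anything else is shown as-is.
TIME_UNIT_LABELS = {"yearmonth": "year-month"}


def default_title(field, aggregate=None, binned=False, time_unit=None):
    """Derive a title from the field name and the transformation applied."""
    if aggregate:
        return f"{aggregate.capitalize()} of {field}"
    if binned:
        return f"{field} (binned)"
    if time_unit:
        return f"{field} ({TIME_UNIT_LABELS.get(time_unit, time_unit)})"
    return field


print(default_title("Profit", aggregate="sum"))
print(default_title("Profit", binned=True))
print(default_title("Transaction Date", time_unit="yearmonth"))
```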

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Color, Fill, and Stroke encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

condition

ValueCondition<string>

One or more value definition(s) with a selection or a test predicate.

Note: A field definition’s condition property can only contain conditional value definitions since Vega-Lite only allows at most one encoded field per encoding channel.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

format

anyOf(string, dict)

When used with the default "number" and "time" format type, the text formatting pattern for labels of guides (axes, legends, headers) and text marks.

See the format documentation for more examples.

When used with a custom "formatType" (one that takes datum.value and a format parameter as input), this property represents the format parameter.

Default value: Derived from numberFormat config for number format and from timeFormat config for time format.

formatType

string

The format type for labels ("number" or "time" or a registered custom format type).

Default value:

  • "time" for temporal fields and ordinal and nominal fields with timeUnit.

  • "number" for quantitative fields as well as ordinal and nominal fields without timeUnit.
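The default rule for formatType amounts to a two-way decision, sketched here with a hypothetical helper.

```python
def default_format_type(field_type, time_unit=None):
    """Return the default label format type for a field definition."""
    if field_type == "temporal":
        return "time"
    if field_type in ("ordinal", "nominal") and time_unit:
        return "time"
    # Quantitative fields, and ordinal or nominal fields without a time unit.
    return "number"


print(default_format_type("temporal"))
print(default_format_type("ordinal", time_unit="month"))
print(default_format_type("nominal"))
```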

labelExpr

string

Vega expression for customizing labels text.

Note: The label text and value can be accessed via the label and value properties of the axis’s backing datum object.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or for a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Shape encoding accepts the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

condition

ValueCondition<(string|null)>

One or more value definition(s) with a selection or a test predicate.

Note: A field definition’s condition property can only contain conditional value definitions since Vega-Lite only allows at most one encoded field per encoding channel.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

legend

anyOf(Legend, null)

An object defining properties of the legend. If null, the legend for the encoding channel will be removed.

Default value: If undefined, default legend properties are applied.

See also: legend documentation.

scale

anyOf(Scale, null)

An object defining properties of the channel’s scale, which is the function that transforms values in the data domain (numbers, dates, strings, etc) to visual values (pixels, colors, sizes) of the encoding channels.

If null, the scale will be disabled and the data value will be directly encoded.

Default value: If undefined, default scale properties are applied.

See also: scale documentation.

sort

Sort

Sort order for the encoded field.

For continuous fields (quantitative or temporal), sort can be either "ascending" or "descending".

For discrete fields, sort can be one of the following:

Default value: "ascending"

Note: null and sorting by another channel is not supported for row and column.

See also: sort documentation.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or for a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

TypeForShape

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The FillOpacity, Opacity, Size, StrokeOpacity, and StrokeWidth encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

condition

ValueCondition<number>

One or more value definition(s) with a selection or a test predicate.

Note: A field definition’s condition property can only contain conditional value definitions since Vega-Lite only allows at most one encoded field per encoding channel.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

legend

anyOf(Legend, null)

An object defining properties of the legend. If null, the legend for the encoding channel will be removed.

Default value: If undefined, default legend properties are applied.

See also: legend documentation.

scale

anyOf(Scale, null)

An object defining properties of the channel’s scale, which is the function that transforms values in the data domain (numbers, dates, strings, etc) to visual values (pixels, colors, sizes) of the encoding channels.

If null, the scale will be disabled and the data value will be directly encoded.

Default value: If undefined, default scale properties are applied.

See also: scale documentation.

sort

Sort

Sort order for the encoded field.

For continuous fields (quantitative or temporal), sort can be either "ascending" or "descending".

For discrete fields, sort can be one of the following:

Default value: "ascending"

Note: null and sorting by another channel is not supported for row and column.

See also: sort documentation.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or for a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Row, Column, and Facet encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

header

Header

An object defining properties of a facet’s header.

sort

anyOf(SortArray, SortOrder, EncodingSortField, null)

Sort order for the encoded field.

For continuous fields (quantitative or temporal), sort can be either "ascending" or "descending".

For discrete fields, sort can be one of the following:

  • "ascending" or "descending" – for sorting by the values’ natural order in JavaScript.

  • A sort field definition for sorting by another field.

  • An array specifying the field values in preferred order. In this case, the sort order will obey the values in the array, followed by any unspecified values in their original order. For a discrete time field, values in the sort array can be date-time definition objects. In addition, for time units "month" and "day", the values can be the month or day names (case insensitive) or their 3-letter initials (e.g., "Mon", "Tue").

  • null indicating no sort.

Default value: "ascending"

Note: null is not supported for row and column.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Text and Tooltip encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

condition

ValueCondition<Text>

One or more value definition(s) with a selection or a test predicate.

Note: A field definition’s condition property can only contain conditional value definitions since Vega-Lite only allows at most one encoded field per encoding channel.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

format

anyOf(string, dict)

When used with the default "number" and "time" format type, the text formatting pattern for labels of guides (axes, legends, headers) and text marks.

See the format documentation for more examples.

When used with a custom "formatType" (one that takes datum.value and a format parameter as input), this property represents the format parameter.

Default value: Derived from numberFormat config for number format and from timeFormat config for time format.

formatType

string

The format type for labels ("number" or "time" or a registered custom format type).

Default value:

  • "time" for temporal fields and ordinal and nominal fields with timeUnit.

  • "number" for quantitative fields as well as ordinal and nominal fields without timeUnit.

labelExpr

string

Vega expression for customizing labels text.

Note: The label text and value can be accessed via the label and value properties of the axis’s backing datum object.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Detail and Key encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Latitude and Longitude encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

null

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

[‘quantitative’]

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Latitude2, Longitude2, X2, Y2, XError, YError, XError2, and YError2 encodings accept the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

null

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

The Href encoding accepts the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

condition

ValueCondition<Text>

One or more value definition(s) with a selection or a test predicate.

Note: A field definition’s condition property can only contain conditional value definitions since Vega-Lite only allows at most one encoded field per encoding channel.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

format

anyOf(string, dict)

When used with the default "number" and "time" format type, the text formatting pattern for labels of guides (axes, legends, headers) and text marks.

See the format documentation for more examples.

When used with a custom "formatType" (one that takes datum.value and a format parameter as input), this property represents the format parameter.

Default value: Derived from numberFormat config for number format and from timeFormat config for time format.

formatType

string

The format type for labels ("number" or "time" or a registered custom format type).

Default value:

  • "time" for temporal fields and ordinal and nominal fields with timeUnit.

  • "number" for quantitative fields as well as ordinal and nominal fields without timeUnit.

labelExpr

string

Vega expression for customizing labels text.

Note: The label text and value can be accessed via the label and value properties of the axis’s backing datum object.

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

The Order encoding accepts the following options:

Property

Type

Description

aggregate

Aggregate

Aggregation function for the field (e.g., "mean", "sum", "median", "min", "max", "count").

Default value: undefined (None)

See also: aggregate documentation.

bin

anyOf(boolean, BinParams, [‘binned’], null)

A flag for binning a quantitative field, an object defining binning parameters, or indicating that the data for x or y channel are binned before they are imported into Vega-Lite ("binned").

  • If true, default binning parameters will be applied.

  • If "binned", this indicates that the data for the x (or y) channel are already binned. You can map the bin-start field to x (or y) and the bin-end field to x2 (or y2). The scale and axis will be formatted similar to binning in Vega-Lite. To adjust the axis ticks based on the bin step, you can also set the axis’s tickMinStep property.

Default value: false

See also: bin documentation.

field

Field

Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the repeat operator.

See also: field documentation.

Notes:

  1. Dots (.) and brackets ([ and ]) can be used to access nested objects (e.g., "field": "foo.bar" and "field": "foo['bar']"). If field names contain dots or brackets but are not nested, you can use \\ to escape dots and brackets (e.g., "a\\.b" and "a\\[0\\]"). See more details about escaping in the field documentation.

  2. field is not required if aggregate is count.

sort

SortOrder

The sort order. One of "ascending" (default) or "descending".

timeUnit

anyOf(TimeUnit, TimeUnitParams)

Time unit (e.g., year, yearmonth, month, hours) for a temporal field, or a temporal field that gets cast as ordinal.

Default value: undefined (None)

See also: timeUnit documentation.

title

anyOf(Text, null)

A title for the field. If null, the title will be removed.

Default value: derived from the field’s name and transformation function (aggregate, bin and timeUnit). If the field has an aggregate function, the function is displayed as part of the title (e.g., "Sum of Profit"). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., "Profit (binned)", "Transaction Date (year-month)"). Otherwise, the title is simply the field name.

Notes:

  1. You can customize the default field title format by providing the fieldTitle property in the config or fieldTitle function via the compile function’s options.

  2. If both field definition’s title and axis, header, or legend title are defined, axis/header/legend title will be used.

type

StandardType

The encoded field’s type of measurement ("quantitative", "temporal", "ordinal", or "nominal"). It can also be a "geojson" type for encoding ‘geoshape’.

Note:

  • Data values for a temporal field can be either a date-time string (e.g., "2015-03-07 12:32:17", "17:01", "2015-03-16", "2015") or a timestamp number (e.g., 1552199579097).

  • Data type describes the semantics of the data rather than the primitive data types (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.

  • When using with bin, the type property can be either "quantitative" (for using a linear bin scale) or "ordinal" (for using an ordinal bin scale).

  • When using with timeUnit, the type property can be either "temporal" (for using a temporal scale) or "ordinal" (for using an ordinal scale).

  • When using with aggregate, the type property refers to the post-aggregation data type. For example, we can calculate count distinct of a categorical field "cat" using {"aggregate": "distinct", "field": "cat", "type": "quantitative"}. The "type" of the aggregate output is "quantitative".

  • Secondary channels (e.g., x2, y2, xError, yError) do not have type as they have exactly the same type as their primary channels (e.g., x, y).

See also: type documentation.

Binning and Aggregation

Beyond simple channel encodings, Altair’s visualizations are built on the concept of database-style grouping and aggregation; that is, the split-apply-combine abstraction that underpins many data analysis approaches.

For example, building a histogram from a one-dimensional dataset involves splitting data based on the bin it falls in, aggregating the results within each bin using a count of the data, and then combining the results into a final figure.

In Altair, such an operation looks like this:

alt.Chart(cars).mark_bar().encode(
    alt.X('Horsepower', bin=True),
    y='count()'
    # could also use alt.Y(aggregate='count', type='quantitative')
)

Notice here we use the shorthand version of expressing an encoding channel (see Encoding Shorthands) with the count aggregation, which is the one aggregation that does not require a field to be specified.

Similarly, we can create a two-dimensional histogram using, for example, the size of points to indicate counts within the grid (sometimes called a “Bubble Plot”):

alt.Chart(cars).mark_point().encode(
    alt.X('Horsepower', bin=True),
    alt.Y('Miles_per_Gallon', bin=True),
    size='count()',
)

There is no need, however, to limit aggregations to counts alone. For example, we could similarly create a plot where the color of each point represents the mean of a third quantity, such as acceleration:

alt.Chart(cars).mark_circle().encode(
    alt.X('Horsepower', bin=True),
    alt.Y('Miles_per_Gallon', bin=True),
    size='count()',
    color='average(Acceleration):Q'
)

In addition to count and average, there are a large number of available aggregation functions built into Altair; they are listed in the following table:

Aggregate

Description

Example

argmin

An input data object containing the minimum field value.

N/A

argmax

An input data object containing the maximum field value.

N/A

average

The mean (average) field value. Identical to mean.

Line Chart with Layered Aggregates

count

The total count of data objects in the group.

Simple Heatmap

distinct

The count of distinct field values.

N/A

max

The maximum field value.

Box Plot with Min/Max Whiskers

mean

The mean (average) field value.

Interactive Scatter Plot and Linked Layered Histogram

median

The median field value.

Box Plot with Min/Max Whiskers

min

The minimum field value.

Box Plot with Min/Max Whiskers

missing

The count of null or undefined field values.

N/A

q1

The lower quartile boundary of values.

Box Plot with Min/Max Whiskers

q3

The upper quartile boundary of values.

Box Plot with Min/Max Whiskers

ci0

The lower boundary of the bootstrapped 95% confidence interval of the mean.

Sorted Error Bars showing Confidence Interval

ci1

The upper boundary of the bootstrapped 95% confidence interval of the mean.

Sorted Error Bars showing Confidence Interval

stderr

The standard error of the field values.

N/A

stdev

The sample standard deviation of field values.

N/A

stdevp

The population standard deviation of field values.

N/A

sum

The sum of field values.

Streamgraph

valid

The count of field values that are not null or undefined.

N/A

values

The list of data objects in the group.

N/A

variance

The sample variance of field values.

N/A

variancep

The population variance of field values.

N/A

Encoding Shorthands

For convenience, Altair allows the specification of the variable name along with the aggregate and type within a simple shorthand string syntax. This makes use of the type shorthand codes listed in Encoding Data Types as well as the aggregate names listed in Binning and Aggregation. The following table shows examples of the shorthand specification alongside the long-form equivalent:

Shorthand

Equivalent long-form

x='name'

alt.X('name')

x='name:Q'

alt.X('name', type='quantitative')

x='sum(name)'

alt.X('name', aggregate='sum')

x='sum(name):Q'

alt.X('name', aggregate='sum', type='quantitative')

x='count():Q'

alt.X(aggregate='count', type='quantitative')

Ordering marks

The order option and Order channel control the order in which marks are drawn on the chart.

For stacked marks, this controls the order of components of the stack. Here, the elements of each bar are sorted alphabetically by the name of the nominal data in the color channel.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_bar().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="ascending")
)

The order can be reversed by changing the sort option to descending.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_bar().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="descending")
)

The same approach works for other mark types, like stacked area charts.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_area().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="ascending")
)

For line marks, the order channel encodes the order in which data points are connected. This can be useful for creating a scatterplot that draws lines between the dots using a different field than the x and y axes.

import altair as alt
from vega_datasets import data

driving = data.driving()

alt.Chart(driving).mark_line(point=True).encode(
    alt.X('miles', scale=alt.Scale(zero=False)),
    alt.Y('gas', scale=alt.Scale(zero=False)),
    order='year'
)
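
The effect of the order channel on a line can be illustrated without any plotting. The sketch below uses a few hypothetical points (the keys t, x, and y are made up for illustration) and compares the path obtained by connecting them in x order, which is assumed here to be the default for a line mark, with the path obtained by connecting them in the order of a separate field.

```python
# Hypothetical toy points; "t" plays the role of the field given to `order`.
points = [
    {"t": 1, "x": 3, "y": 1},
    {"t": 2, "x": 1, "y": 2},
    {"t": 3, "x": 2, "y": 3},
]


def path(points, order_by):
    """Return the (x, y) vertices in the order they would be connected."""
    ordered = sorted(points, key=lambda point: point[order_by])
    return [(point["x"], point["y"]) for point in ordered]


path_by_x = path(points, "x")  # connect points from left to right
path_by_t = path(points, "t")  # connect points in the order given by "t"

print(path_by_x)
print(path_by_t)
```

The same three points yield two different polylines, which is the difference the order encoding makes in the connected scatterplot above.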

Sorting

Specific channels can take a sort property which determines the order of the scale being used for the channel. There are a number of different sort options available:

  • sort='ascending' (default) sorts the field’s values in ascending order. For string data, this uses standard alphabetical order.

  • sort='descending' sorts the field’s values in descending order.

  • Passing the name of an encoding channel to sort, such as "x" or "y", sorts by that channel. An optional minus prefix can be used for a descending sort. For example, sort='-x' would sort by the x channel in descending order.

  • Passing a list to sort allows you to explicitly set the order in which you would like the encoding to appear.

  • Passing an EncodingSortField object to sort allows you to sort an axis by the value of some other field in the dataset.

Here is an example of applying these five different sort approaches on the x-axis, using the barley dataset:

import altair as alt
from vega_datasets import data

barley = data.barley()

base = alt.Chart(barley).mark_bar().encode(
    y='mean(yield):Q',
    color=alt.Color('mean(yield):Q', legend=None)
).properties(width=100, height=100)

# Sort x in ascending order
ascending = base.encode(
    alt.X(field='site', type='nominal', sort='ascending')
).properties(
    title='Ascending'
)

# Sort x in descending order
descending = base.encode(
    alt.X(field='site', type='nominal', sort='descending')
).properties(
    title='Descending'
)

# Sort x in an explicitly-specified order
explicit = base.encode(
    alt.X(field='site', type='nominal',
          sort=['Duluth', 'Grand Rapids', 'Morris',
                'University Farm', 'Waseca', 'Crookston'])
).properties(
    title='Explicit'
)

# Sort according to encoding channel
sortchannel = base.encode(
    alt.X(field='site', type='nominal',
          sort='y')
).properties(
    title='By Channel'
)

# Sort according to another field
sortfield = base.encode(
    alt.X(field='site', type='nominal',
          sort=alt.EncodingSortField(field='yield', op='mean'))
).properties(
    title='By Yield'
)

alt.concat(
    ascending, descending, explicit,
    sortchannel, sortfield,
    columns=3
)

The last two charts are identical because the y channel already encodes the mean of yield (see Binning and Aggregation), which is the same quantity that EncodingSortField computes with op='mean'. To highlight the difference between sorting via channel and sorting via field, consider the following example, where we don’t aggregate the data:

import altair as alt
from vega_datasets import data

barley = data.barley()
base = alt.Chart(barley).mark_point().encode(
    y='yield:Q',
).properties(width=200)

# Sort according to encoding channel
sortchannel = base.encode(
    alt.X(field='site', type='nominal',
          sort='y')
).properties(
    title='By Channel'
)

# Sort according to another field
sortfield = base.encode(
    alt.X(field='site', type='nominal',
          sort=alt.EncodingSortField(field='yield', op='min'))
).properties(
    title='By Min Yield'
)
sortchannel | sortfield

By passing an EncodingSortField object to sort, we have more control over the sorting process.
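
As a rough picture of what sorting by another field involves, the following pure-Python sketch groups hypothetical rows by category, reduces each group with an aggregate operation, and orders the categories by the result. The function sorted_categories and the toy rows are illustrative only; the op and order parameters are meant to mirror the corresponding EncodingSortField arguments, not to reproduce how Altair or Vega-Lite implement them.

```python
import statistics

# Hypothetical toy rows of (category, value); not taken from the barley dataset.
rows = [
    ("A", 30.0), ("A", 50.0),
    ("B", 20.0), ("B", 70.0),
    ("C", 40.0), ("C", 42.0),
]

# A few aggregate operations, keyed by the names used in the aggregation table.
OPS = {"min": min, "max": max, "mean": statistics.mean, "sum": sum}


def sorted_categories(rows, op="mean", order="ascending"):
    """Return category names ordered by an aggregate of their values."""
    groups = {}
    for category, value in rows:
        groups.setdefault(category, []).append(value)
    summary = {name: OPS[op](values) for name, values in groups.items()}
    return sorted(summary, key=summary.get, reverse=(order == "descending"))


print(sorted_categories(rows, op="mean"))
print(sorted_categories(rows, op="min"))
print(sorted_categories(rows, op="max", order="descending"))
```

Changing the operation changes the resulting category order even though the rows are the same, which is the extra control that a sort field offers over a plain 'ascending' or 'descending' sort.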

Sorting Legends

While the above examples show sorting of axes by specifying sort in the X and Y encodings, legends can be sorted by specifying sort in the Color encoding:

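import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_rect().encode(
    alt.X('mean(yield):Q', sort='ascending'),
    alt.Y('site:N', sort='descending'),
    # An explicit list fixes the order of entries in the color legend
    alt.Color('site:N',
        sort=['Morris', 'Duluth', 'Grand Rapids',
              'University Farm', 'Waseca', 'Crookston']
    )
)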

Here the y-axis is sorted reverse-alphabetically, while the color legend is sorted in the specified order, beginning with 'Morris'.