YAML Selectors
Write resource selectors in YAML, save them with a human-friendly name, and reference them using the --selector
flag.
By recording selectors in a top-level selectors.yml
file:
- Legibility: complex selection criteria are composed of dictionaries and arrays
- Version control: selector definitions are stored in the same git repository as the dbt project
- Reusability: selectors can be referenced in multiple job definitions, and their definitions are extensible (via YAML anchors)
Selectors live in a top-level file named selectors.yml
. Each must have a name
and a definition
, and can optionally define a description
and default
flag.
selectors:
- name: nodes_to_joy
definition: ...
- name: nodes_to_a_grecian_urn
description: Attic shape with a fair attitude
default: true
definition: ...
Definitions
Each definition
is comprised of one or more arguments, which can be one of the following:
- CLI-style: strings, representing CLI-style arguments
- Key-value: pairs in the form
method: value
- Full YAML: fully specified dictionaries with items for
method
,value
, operator-equivalent keywords, and support forexclude
Use the union
and intersection
operator-equivalent keywords to organize multiple arguments.
CLI-style
definition:
'tag:nightly'
This simple syntax supports use of the +
, @
, and *
graph operators, but it does not support set operators or exclude
.
Key-value
definition:
tag: nightly
This simple syntax does not support any graph or set operators or exclude
.
Full YAML
This is the most thorough syntax, which can include the operator-equivalent keywords for graph and set operators.
Review methods for the available list.
definition:
method: tag
value: nightly
# Optional keywords map to the `+` and `@` graph operators:
children: true | false
parents: true | false
children_depth: 1 # if children: true, degrees to include
parents_depth: 1 # if parents: true, degrees to include
childrens_parents: true | false # @ operator
indirect_selection: eager | cautious | buildable | empty # include all tests selected indirectly? eager by default
The *
operator to select all nodes can be written as:
definition:
method: fqn
value: "*"
Exclude
The exclude
keyword is only supported by fully-qualified dictionaries.
It may be passed as an argument to each dictionary, or as
an item in a union
. The following are equivalent:
- method: tag
value: nightly
exclude:
- "@tag:daily"
- union:
- method: tag
value: nightly
- exclude:
- method: tag
value: daily
Note: The exclude
argument in YAML selectors is subtly different from
the --exclude
CLI argument. Here, exclude
always returns a set difference,
and it is always applied last within its scope.
When more than one "yeslist" (--select
) is passed, they are treated as a union rather than an intersection. Same thing when there is more than one "nolist" (--exclude
).
Indirect selection
As a general rule, dbt will indirectly select all tests if they touch any resource that you're selecting directly. We call this "eager" indirect selection. You can optionally switch the indirect selection mode to "cautious", "buildable", or "empty" by setting indirect_selection
for a specific criterion:
- union:
- method: fqn
value: model_a
indirect_selection: eager # default: will include all tests that touch model_a
- method: fqn
value: model_b
indirect_selection: cautious # will not include tests touching model_b
# if they have other unselected parents
- method: fqn
value: model_c
indirect_selection: buildable # will not include tests touching model_c
# if they have other unselected parents (unless they have an ancestor that is selected)
- method: fqn
value: model_d
indirect_selection: empty # will include tests for only the selected node and ignore all tests attached to model_d
If provided, a YAML selector's indirect_selection
value will take precedence over the CLI flag --indirect-selection
. Because indirect_selection
is defined separately for each selection criterion, it's possible to mix eager/cautious/buildable/empty modes within the same definition, to achieve the exact behavior that you need. Remember that you can always test out your critiera with dbt ls --selector
.
See test selection examples for more details about indirect selection.
Example
Here are two ways to represent:
$ dbt run --select @source:snowplow,tag:nightly models/export --exclude package:snowplow,config.materialized:incremental export_performance_timing
- CLI-style
- Full YML
selectors:
- name: nightly_diet_snowplow
description: "Non-incremental Snowplow models that power nightly exports"
definition:
# Optional `union` and `intersection` keywords map to the ` ` and `,` set operators:
union:
- intersection:
- '@source:snowplow'
- 'tag:nightly'
- 'models/export'
- exclude:
- intersection:
- 'package:snowplow'
- 'config.materialized:incremental'
- export_performance_timing
selectors:
- name: nightly_diet_snowplow
description: "Non-incremental Snowplow models that power nightly exports"
definition:
# Optional `union` and `intersection` keywords map to the ` ` and `,` set operators:
union:
- intersection:
- method: source
value: snowplow
childrens_parents: true
- method: tag
value: nightly
- method: path
value: models/export
- exclude:
- intersection:
- method: package
value: snowplow
- method: config.materialized
value: incremental
- method: fqn
value: export_performance_timing
Then in our job definition:
dbt run --selector nightly_diet_snowplow
Default
Selectors may define a boolean default
property. If a selector has default: true
, dbt will use this selector's criteria when tasks do not define their own selection criteria.
Let's say we define a default selector that only selects resources defined in our root project:
selectors:
- name: root_project_only
description: >
Only resources from the root project.
Excludes resources defined in installed packages.
default: true
definition:
method: package
value: <my_root_project_name>
If I run an "unqualified" command, dbt will use the selection criteria defined in root_project_only
—that is, dbt will only build / freshness check / generate compiled SQL for resources defined in my root project.
dbt build
dbt source freshness
dbt docs generate
If I run a command that defines its own selection criteria (via --select
, --exclude
, or --selector
), dbt will ignore the default selector and use the flag criteria instead. It will not try to combine the two.
dbt run --select "model_a"
dbt run --exclude model_a
Only one selector may set default: true
for a given invocation; otherwise, dbt will return an error. You may use a Jinja expression to adjust the value of default
depending on the environment, however:
selectors:
- name: default_for_dev
default: "{{ target.name == 'dev' | as_bool }}"
definition: ...
- name: default_for_prod
default: "{{ target.name == 'prod' | as_bool }}"
definition: ...
Selector inheritance
Selectors can reuse and extend definitions from other selectors, via the selector
method.
selectors:
- name: foo_and_bar
definition:
intersection:
- tag: foo
- tag: bar
- name: foo_bar_less_buzz
definition:
intersection:
# reuse the definition from above
- method: selector
value: foo_and_bar
# with a modification!
- exclude:
- method: tag
value: buzz
Note: While selector inheritance allows the logic from another selector to be reused, it doesn't allow the logic from that selector to be modified by means of parents
, children
, indirect_selection
, and so on.
The selector
method returns the complete set of nodes returned by the named selector.
Difference between --select
and --selector
In dbt, select
and selector
are related but distinct concepts used for choosing specific models, tests, or resources. The following tables explains the differences and when to best use them:
Feature | --select | --selector |
---|---|---|
Definition | Ad-hoc, specified directly in the command. | Pre-defined in selectors.yml file. |
Usage | One-time or task-specific filtering. | Reusable for multiple executions. |
Complexity | Requires manual entry of selection criteria. | Can encapsulate complex logic for reuse. |
Flexibility | Very flexible but less reusable. Allows using graph operators (such as + , @ .) and set operators. | Slightly less flexible; focuses on reusable and structured logic, supports YAML-based union , intersection , and exclude . |
Example | dbt run --select my_model+ (runs my_model and all downstream dependencies with the + operator). | dbt run --selector nightly_diet_snowplow (runs models defined by the nightly_diet_snowplow selector in selectors.yml ). |
For additional examples, check out this GitHub Gist.